ranahanocka / MeshCNN

Convolutional Neural Network for 3D meshes in PyTorch

ValueError: min() arg is an empty sequence #46

Open claell opened 4 years ago

claell commented 4 years ago

@ranahanocka First of all, thanks for this interesting CNN approach.

Unfortunately I am currently running into this error when trying to train the network on my own dataset. Faces were simplified to 600 with the blender script. Do you have an idea what could possibly be causing this? Bad .obj files? Wrong parameters in the training configuration?

Console output:

saving the latest model (epoch 1, total_steps 16)
(epoch: 1, iters: 80, time: 0.045, data: 3.654) loss: 0.672 
Traceback (most recent call last):
  File "train.py", line 23, in <module>
    for i, data in enumerate(dataset):
  File "/kaggle/working/MeshCNN/data/__init__.py", line 33, in __iter__
    for i, data in enumerate(self.dataloader):
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.6/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/kaggle/working/MeshCNN/data/classification_data.py", line 27, in __getitem__
    mesh = Mesh(file=path, opt=self.opt, hold_history=False, export_folder=self.opt.export_folder)
  File "/kaggle/working/MeshCNN/models/layers/mesh.py", line 16, in __init__
    fill_mesh(self, file, opt)
  File "/kaggle/working/MeshCNN/models/layers/mesh_prepare.py", line 11, in fill_mesh
    mesh_data = from_scratch(file, opt)
  File "/kaggle/working/MeshCNN/models/layers/mesh_prepare.py", line 61, in from_scratch
    post_augmentation(mesh_data, opt)
  File "/kaggle/working/MeshCNN/models/layers/mesh_prepare.py", line 185, in post_augmentation
    slide_verts(mesh, opt.slide_verts)
  File "/kaggle/working/MeshCNN/models/layers/mesh_prepare.py", line 198, in slide_verts
    if min(dihedral[edges]) > 2.65:
ValueError: min() arg is an empty sequence
ranahanocka commented 4 years ago

Hi @claell ,

For some reason, the code is complaining that the meshes are empty (i.e., edges is an empty sequence).

Is it possible your mesh contains all non-manifold geometry? Also, did you modify ninput_edges and the pooling resolution according to your new meshes? Be sure to read the Code Options wiki if you haven't already.

claell commented 4 years ago

Ok. I managed to work around this by setting slide_verts to 0, but obviously it would be nice to track this down further.

When running the blender script I got #48, maybe that causes the problem? I also sometimes get #49.

I am not that experienced with mesh files; how can I analyze them further to see whether the geometry might be non-manifold or empty?

I modified the ninput_edges and pooling resolution as described in https://github.com/ranahanocka/MeshCNN/issues/40#issuecomment-559518905.

ranahanocka commented 4 years ago

Hi @claell ,

I am not sure why slide_verts causes this problem (it simply perturbs the vertex locations slightly). You can view the .obj files in MeshLab, and also check for non-manifoldness / boundary edges or other weird stuff there.

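For anyone who prefers to do this check programmatically rather than in MeshLab, here is a minimal sketch using the third-party trimesh library (not part of MeshCNN; the folder path is a placeholder and it assumes one mesh per .obj file):

import os
import trimesh

data_dir = "/path/to/obj/files"  # placeholder: your dataset folder

for name in sorted(os.listdir(data_dir)):
    if not name.endswith(".obj"):
        continue
    # process=False keeps the mesh exactly as stored in the file
    mesh = trimesh.load(os.path.join(data_dir, name), process=False)
    if len(mesh.faces) == 0:
        print(name, "has no faces")
    elif not mesh.is_watertight:
        # boundary or non-manifold edges make a mesh non-watertight
        print(name, "is not watertight (boundary or non-manifold edges)")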

claell commented 4 years ago

@ranahanocka in the meantime I checked the .obj files in FreeCAD for errors. At least no errors were found for the random samples I checked. I could also try checking with MeshLab again.

After thinking further about the problems I have with MeshCNN, they might have to do with the number of faces/edges of the input files. Some have very simple geometry (like cubes), resulting in a face count of about 70. Other, more complex files get normalized to roughly 600 faces with the script, but it won't upsample the very simple geometries to 600 faces. Could this lead to problems?

ranahanocka commented 4 years ago

Hi @claell ,

Yes -- I recommend using subdivision to upsample the number of faces. The idea is that the network should see 3D models of roughly the same resolution, sort of like rescaling images to roughly the same size before training a network to classify or segment them. If there are small differences in resolution, it should still work with no problems. However, I expect it to be more challenging to train a single network which can generalize on very small and very large meshes.

I think the blender_process script should do that (if you give it a mesh of 70 faces and the target is 600, it should upsample it). You should try that to see if it works. If after running the script once the mesh still doesn't have enough faces, you can always run it through the script again.
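If the blender script still leaves some meshes too coarse, here is a minimal sketch of upsampling them separately with the third-party trimesh library instead (this is not part of MeshCNN or blender_process.py; the file names and target count are placeholders):

import trimesh

target_faces = 600  # placeholder: the same target passed to blender_process.py

# process=False loads the mesh exactly as stored in the .obj file
mesh = trimesh.load("model.obj", process=False)

# each call to subdivide() splits every triangle into four smaller triangles,
# so loop until the face count reaches the target resolution
while len(mesh.faces) < target_faces:
    mesh = mesh.subdivide()

mesh.export("model_upsampled.obj")

The upsampled file can then be run through the simplification script again to bring it back down to the exact target face count.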

claell commented 4 years ago

Hi @ranahanocka thanks for your help.

However, I expect it to be more challenging to train a single network which can generalize on very small and very large meshes.

This sounds like it is recommended, but should not result in errors during training, correct?

I think the blender_process script should do that

Then the 70-face bodies probably had even fewer faces before, and the script just did not manage to upsample them enough. I will try running it several times over the files (I use it within a batch script that loops over all files in a folder).

I also did some experiments with the arguments for MeshCNN. After setting the batch size to 1, it ran for some samples and then gave different errors (like this one) every time. So it seems some samples are causing different kinds of problems. I also noticed that I had to set the number of input edges to at least 2000. This seems wrong, since the input files should only have 600 faces with about 1000 edges.

claell commented 4 years ago

The need to set the number of input edges to around 2000 in order to generate mean / std (#18) seems to be independent of the number of faces. I had the same problem with files normalized to 300 faces as well as with some normalized to 600. I am now mainly stuck at #40.

claell commented 4 years ago

I just sent you a mail with the dataset I am trying to use. Hopefully that will shed some more light onto my issues.

ranahanocka commented 4 years ago

Hi @claell --

I just took a look at a couple of meshes you sent. Some of them have very high genus (meaning a lot of holes). High-genus shapes are more difficult to simplify to a small number of edges -- so I suggest reducing the amount of pooling to a really small amount. It seems your meshes have at most 900 edges, right? In that case, for sanity checking you can run:

--pool_res 900 900 900

which will not pool at all. Then you can try to run something like:

--pool_res 870 870 870

which will pool once to 870, and then no more. You can play with these numbers, gradually increasing the pooling, until you find a config you like.

claell commented 4 years ago

Hi @ranahanocka

Thanks for having a look! It seems I had a misunderstanding about the pool_res parameters. I thought they indicated the number of faces, but they actually refer to the number of edges. That, in combination with the many holes in some shapes, helps to explain some errors I got.

Since somehow I have to set ninput_edges to at least 2000, I apparently also had to do that for pool_res. So I started with 2000 2000 2000 and did some experimenting. Now I am at 2000 1900 1800 1700, which seems to work.

However it is still weird and might also not be good that I have to set ninput_edges to 2000. All shapes should have below 900 edges. Do you have a clue what might be causing this?

Also, slide_verts still seems to fail in some cases during training, so I had to set it to 0 as well.

Here is the complete command used; much of it was derived from the shrec_16 dataset example: !python /kaggle/working/MeshCNN/train.py --dataroot "/kaggle/working/MeshCNNInput" --name "MeshCNNInput" --batch_size 16 --ncf 64 128 256 --pool_res 2000 1900 1800 1700 --niter_decay 100 --ninput_edges 2000 --norm group --num_aug 2 --num_groups 2 --resblocks 1 --flip_edges 0.2

ranahanocka commented 4 years ago

Hi @claell ,

However it is still weird and might also not be good that I have to set ninput_edges to 2000. All shapes should have below 900 edges. Do you have a clue what might be causing this?

ninput_edges must be at least the maximum number of edges in your entire set. Possibly there is one mesh with more edges? Maybe run a simple script which iterates through each mesh and finds the one with the max len(mesh.edges).
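A rough sketch of such a script (a plain .obj parser that counts unique undirected edges per file; the folder path is a placeholder):

import os

data_dir = "/path/to/obj/files"  # placeholder: your dataset folder

def count_edges(obj_path):
    # collect the unique undirected edges implied by the face records
    edges = set()
    with open(obj_path) as f:
        for line in f:
            if line.startswith("f "):
                # keep only the vertex index from tokens like "3/1/2"
                idx = [int(tok.split("/")[0]) for tok in line.split()[1:]]
                for i in range(len(idx)):
                    a, b = idx[i], idx[(i + 1) % len(idx)]
                    edges.add((min(a, b), max(a, b)))
    return len(edges)

counts = {name: count_edges(os.path.join(data_dir, name))
          for name in os.listdir(data_dir) if name.endswith(".obj")}
print("mesh with the most edges:", max(counts.items(), key=lambda kv: kv[1]))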

Also, slide_verts still seems to fail in some cases during training, so I had to set it to 0 as well.

Sometimes there are issues with the cached dataset, especially if you made changes. I suggest deleting those cached .npz files and trying to re-run with slide verts.
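A minimal sketch of clearing that cache (assuming the cached .npz files live under your dataset folder; the path is a placeholder):

import os

dataroot = "/path/to/dataroot"  # placeholder: the folder passed as --dataroot

for dirpath, _, filenames in os.walk(dataroot):
    for name in filenames:
        if name.endswith(".npz"):
            os.remove(os.path.join(dirpath, name))
            print("deleted", os.path.join(dirpath, name))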

claell commented 4 years ago

Hi @ranahanocka

There should not be any meshes with more edges, since all of them were simplified with the blender script. However, I will investigate this further, for example by running with only a few manually checked meshes or by writing a little script for that.

Regarding slide_verts, you might be correct. Although I am running on Kaggle, so every time I restart the machine, everything gets deleted. However, this could also have been caused when I changed some parameters to make other things work. I will keep an eye on whether it causes errors again.

claell commented 4 years ago

Hi @ranahanocka

I had some other stuff to do in the meantime, but recently I had time to revisit this. It turns out that there was one file that had around 2000 edges. So your guess was correct.

I wrote a small script for FreeCAD to identify the files that do not get simplified enough by the blender script.

Here is a small excerpt for anyone who might be interested:

import os

import FreeCAD
import FreeCADGui
import Mesh

inputDir = "/path/to/obj/files"  # placeholder: folder with the simplified .obj files
edgeNumber = 900                 # expected edge count after simplification

print("Start.")
# Loop over the directory and flag meshes whose edge count is more than 10 off the target
for entry in os.scandir(inputDir):
    importPath = entry.path
    if importPath.endswith(".obj"):
        mesh = Mesh.Mesh()
        mesh.read(importPath)
        facets = mesh.CountFacets
        edges = mesh.CountEdges
        if edges < edgeNumber - 10 or edges > edgeNumber + 10:
            print(importPath + " - Edges: " + str(edges) + " - Facets: " + str(facets))
            FreeCADGui.updateGui()
print("Done.")
JoeLambourne commented 4 years ago

@claell,

I also had the problem you reported with

 if min(dihedral[edges]) > 2.65:

After debugging the code, I found that I had accidentally generated OBJ files which included some vertices that were not used by any triangles in the file. You might be having a similar issue here.

I have a short function which can be used to remove any unused vertices as MeshCNN loads the mesh.

def remove_unused_vertices(vs, faces):
    # Find which vertices are used
    is_vertex_used = [ False for v in vs]
    for face in faces:
        for vid in face:
            is_vertex_used[vid] = True

    # Find the new vertex array and vertex id map
    used_vertices = []
    vertex_id_map = []
    for i, used in enumerate(is_vertex_used):
        if used:
            new_vid = len(used_vertices)
            used_vertices.append(vs[i])
            vertex_id_map.append(new_vid)
        else:
            vertex_id_map.append(-1)

    # Re-map the faces
    remapped_faces = []
    for face in faces:
        mapped_face = []
        for vid in face:
            mapped_vid = vertex_id_map[vid]
            assert mapped_vid >= 0
            mapped_face.append(mapped_vid)
        remapped_faces.append(mapped_face)

    return used_vertices, remapped_faces

I was able to call this function from fill_from_file() in mesh_prepare.py. It appears to clean up the problem for me.
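For reference, a minimal sketch of how the call might be wired in (the exact surrounding lines depend on your MeshCNN version; vs and faces are the vertex and face lists that fill_from_file() builds while parsing the .obj file):

# inside fill_from_file() in models/layers/mesh_prepare.py,
# after the .obj file has been parsed into vs (vertices) and faces:
vs, faces = remove_unused_vertices(vs, faces)
# ...then continue with the existing conversion to numpy arrays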

claell commented 4 years ago

@JoeLambourne thanks for your feedback. I am not sure whether we have the same root problem, though. Did you also use the blender script to simplify the meshes to a given number of faces? And did you check whether your .obj files are recognized as intact by a tool like MeshLab, as @ranahanocka suggested before?

Currently I assume that the problems I had came from meshes with many holes that could not be simplified "enough" by the blender script.

JoeLambourne commented 4 years ago

@claell, no. I'm using a different meshing library from Autodesk. I was leaving some unused vertices in the model by mistake.

claell commented 4 years ago

@JoeLambourne Alright. I guess that the root of our problems is not the same, then. However, it might help others with the same problem.