I am not sure why you are receiving NaNs; I have rerun a few of my examples and I do not see them. To help debug, can you provide the following:
Sorry for the late reply. I am using my own mesh and have updated the config file. What you need is pasted here:
Our consistency loss is implemented by mapping each visible vertex to a pixel in the 224x224 images passed to CLIP; visibility is computed by the nvdiffrast rasterizer. It then compares corresponding visible vertices across the batched views.
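For intuition, here is a minimal, simplified sketch of that idea, not the repository's actual code; the tensor names, shapes, and the plain MSE comparison are all assumptions made for illustration:

```python
import torch

def consistency_loss_sketch(vertex_pixels, visibility):
    """Toy multi-view consistency loss.

    vertex_pixels: (B, V, C) features sampled at each vertex's projected
                   pixel in the 224x224 image rendered for each view.
    visibility:    (B, V) boolean mask from the rasterizer; True where the
                   vertex is visible in that view.
    """
    B = vertex_pixels.shape[0]
    pair_losses = []
    for i in range(B):
        for j in range(i + 1, B):
            shared = visibility[i] & visibility[j]        # vertices visible in both views
            diff = vertex_pixels[i][shared] - vertex_pixels[j][shared]
            pair_losses.append((diff ** 2).mean())        # mean over an empty set -> nan
    return torch.stack(pair_losses).mean()
```

Note that if `shared` is empty for every pair that survives filtering, the returned loss is `nan`, which is the situation described below.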
I took a look at your configuration. Because your mesh has 100k+ vertices, rendering at a low resolution like 224x224 produces a large number of occlusions in our method. Since your batch size is relatively small (5), these occlusions mean that, even though it may look like shared vertices are being rendered between the batched views, the vertices actually computed as visible can differ slightly from view to view. Our method also filters view-pairs based on how close their viewing angles are; with your batch size, it may be the case that all pairs get filtered out. When the loss is `nan`, it means that our method did not find any shared vertices among the unfiltered view-pairs.
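The `nan` value itself is just standard PyTorch behaviour rather than anything specific to this repository: reducing an empty tensor returns `nan` instead of raising an error.

```python
import torch

# Mean over an empty tensor yields nan, which is what gets logged when
# no shared visible vertices survive the view-pair filtering.
print(torch.empty(0).mean())  # tensor(nan)
```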
There are a few ways you can address this, in my recommended order:

1. Increase the batch size, if your GPU memory permits; to enable this, you could try parallelizing our method over multiple GPUs. Increasing the batch size may also generally improve your results.
2. Increase the permissible difference between viewing angles (the `consistency_elev_filter` and `consistency_azim_filter` parameters).
3. Simplify the input mesh (our face results came from a mesh with only around 10k vertices).
4. Alternatively, ignore these `nan`s, since no gradients are passed (the `nan` results from the mean of an empty tensor); a small guard for this is sketched below.
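If you go with option 4, something like the following hypothetical helper (not part of this repository; `clip_loss`, `consistency_loss`, and `weight` are placeholder names) keeps the `nan` from contaminating logging or any total you accumulate:

```python
import torch

def combine_losses(clip_loss, consistency_loss, weight=1.0):
    """Drop the consistency term when it is nan, i.e. when no shared
    visible vertices were found for any unfiltered view-pair."""
    if torch.isnan(consistency_loss):
        return clip_loss
    return clip_loss + weight * consistency_loss

# Dummy example:
print(combine_losses(torch.tensor(0.8), torch.tensor(float("nan"))))  # tensor(0.8000)
```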
Could you give me some suggestions as to why there are NaN losses in the fitting procedure?