HM102 opened this issue 5 years ago
Weight normalization should be the same in train and eval. It sounds like the network failed to learn anything -- as you say, the reconstruction will fail if the SDF values are all positive or all negative. How long had the network trained before you tried to do reconstruction? Did the training loss decrease? One potential problem that could lead to this issue is poor initialization of the network combined with SDF clamping. If the network predicts values outside of the clamping distance, there will be no gradient, so if the initialized network predicts all values outside of this range it will not learn anything.
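To illustrate the no-gradient effect, here is a minimal sketch (not the repository's training code; the clamp distance of 0.1 is an assumption) showing that a prediction outside the clamp range receives zero gradient through the clamped L1 loss:

```python
import torch

# Minimal sketch of a DeepSDF-style clamped L1 loss. If the prediction
# saturates the clamp, the gradient w.r.t. the prediction is zero, so the
# network gets no learning signal from that sample.
clamp_dist = 0.1
pred = torch.tensor([0.5], requires_grad=True)  # prediction far outside [-0.1, 0.1]
gt = torch.tensor([0.02])

loss = torch.nn.functional.l1_loss(
    torch.clamp(pred, -clamp_dist, clamp_dist),
    torch.clamp(gt, -clamp_dist, clamp_dist),
)
loss.backward()
print(pred.grad)  # tensor([0.]) -- no gradient, nothing is learned from this sample
```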
I met the same problem while trying single-object reconstruction. The network trained for 2000 epochs, during which the loss decreased from 0.042 to 0.0063. While reconstructing, I noticed that every initialized latent code produced values > 0.1 through the trained decoder, so there is no gradient to update the latent code (only the regularization gradient remains). In other words, the gradient is all zero. I don't know how to change the initialization, so I disabled the clamping instead (commented out lines 58 and 74 in reconstruct.py). After this, the latent code is learned correctly and the mesh reconstructs correctly.
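For reference, a minimal sketch of latent-code optimization with clamping disabled (not the exact reconstruct.py code; `decoder`, `latent_size`, `num_iterations`, `xyz`, and `sdf_gt` are assumed to come from the usual DeepSDF setup):

```python
import torch

# Sketch: optimize only the latent code against a frozen, trained decoder.
# `xyz` is an (N, 3) tensor of sample points, `sdf_gt` the (N, 1) ground-truth SDF.
latent = torch.ones(1, latent_size).normal_(0, 0.01).requires_grad_(True)
optimizer = torch.optim.Adam([latent], lr=5e-3)

for _ in range(num_iterations):
    optimizer.zero_grad()
    latent_inputs = latent.expand(xyz.shape[0], -1)
    pred_sdf = decoder(torch.cat([latent_inputs, xyz], dim=1))
    # No clamp here: predictions outside [-0.1, 0.1] still produce a gradient
    # that can pull the latent code toward a useful region.
    loss = torch.nn.functional.l1_loss(pred_sdf, sdf_gt)
    loss = loss + 1e-4 * torch.mean(latent.pow(2))  # latent-code regularization
    loss.backward()
    optimizer.step()
```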
When I train with small batch sizes (e.g. 3 or 4), I get an error during reconstruction because the network never predicts negative SDF values, although everything looks fine during training. If I replace `decoder.eval()` with `decoder.train()`, I get normal reconstructions. So I guess the problem is the dropout scaling difference or the weight normalization behaving differently between training and testing. @tschmidt23 your feedback is much appreciated!
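A minimal standalone demo of the train/eval difference being suspected here (generic PyTorch, not the DeepSDF decoder): dropout zeroes units and rescales the survivors in train mode, but is an identity map in eval mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.2)
x = torch.ones(1, 5)

drop.train()
print(drop(x))  # some entries zeroed, the rest scaled by 1 / (1 - 0.2) = 1.25

drop.eval()
print(drop(x))  # identity: tensor([[1., 1., 1., 1., 1.]])
```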