Closed janicepan closed 4 years ago
I think a few settings can cause this situation.
@angshine Thank you for the reply! I appreciate the suggestions! Were you able to successfully train it on either KITTI or NYU?
I tried deleting the dropout steps, changing the alpha and beta, and changing K, but I am still getting a constant output.
@janicepan I can train it on KITTI; I haven't tried NYU. I hit the constant-output problem while trying to reproduce the result, and I think my problem was setting improper alpha and beta values. Are you training from scratch or using weights pretrained on ImageNet?
@angshine I am using the pretrained weights from the pretrained resnet model. I did adjust the alpha and beta based on the min and max range values. Is that correct? How did you find the proper values?
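For reference, DORN derives its ordinal bin edges from alpha, beta, and K via spacing-increasing discretization (SID). A minimal sketch of that formula is below; the KITTI-like values (alpha = 1.0, beta = 81.0, K = 80) are illustrative assumptions following the paper's practice of shifting the depth range so that alpha becomes 1, not values taken from this repo:

```python
import math

def sid_thresholds(alpha, beta, K):
    """SID bin edges as in the DORN paper:
    t_i = exp(log(alpha) + i * log(beta / alpha) / K), i = 0..K."""
    return [math.exp(math.log(alpha) + i * math.log(beta / alpha) / K)
            for i in range(K + 1)]

# Illustrative KITTI-like range: depths shifted into [1, 81] metres, K = 80.
ts = sid_thresholds(1.0, 81.0, 80)
# ts[0] == alpha, ts[-1] == beta, with bin widths growing with depth.
```

A quick sanity check on alpha and beta is that the first and last thresholds land exactly on your (shifted) min and max depth; if they don't, the ordinal labels will be wrong for part of the range.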
@janicepan Have you checked each layer's output to see if there are any NaN values? When I debugged this error, I found that the first few layers of the backbone were producing lots of NaNs because the pre-trained weights I was using were incorrect.
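One lightweight way to run that check in PyTorch is to register forward hooks on the leaf modules and report the first layer whose output contains NaNs. This is a generic sketch, not code from this repo; `add_nan_hooks` and the toy `nn.Sequential` model are illustrative stand-ins for the real backbone:

```python
import torch
import torch.nn as nn

def add_nan_hooks(model):
    """Register forward hooks that print which leaf layer emits NaNs.
    Helps confirm whether bad pre-trained weights corrupt early outputs."""
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and torch.isnan(output).any():
                print(f"NaN detected in output of layer: {name}")
        return hook
    # Hook only leaf modules (no children) to avoid duplicate reports.
    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules()
               if len(list(m.children())) == 0]
    return handles  # call h.remove() on each handle to detach later

# Minimal demo with a toy model (not the DORN backbone):
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
handles = add_nan_hooks(model)
out = model(torch.randn(2, 4))
```

Running one forward pass over a real batch with these hooks attached will pinpoint where NaNs first appear, which is usually enough to tell weight-loading problems apart from loss/optimizer problems.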
Thanks again @angshine for the suggestion to check the layer outputs. Through my tests, I didn't find any nan outputs, but I am still unable to train it. Is the default small batch size of 6 working for you? With such a small batch size, I found that I need to use a very very small learning rate in order for the network to not converge immediately to outputting constant images, and the results don't end up looking like anything. I also cannot use a larger batch size, because I run into memory issues with how large the network is. Did you (or anyone else who might come across this post) run into similar issues?
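If memory is what caps the batch size, one common workaround (generic PyTorch practice, not something this repo provides) is gradient accumulation: run several small batches, accumulate their gradients, and step the optimizer once, giving a larger effective batch without the memory cost. A sketch with a toy model and random data standing in for the real pipeline:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real model and dataloader.
model = nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()
accum_steps = 4  # effective batch = accum_steps * per-step batch (here 4 * 6)

opt.zero_grad()
for step in range(8):
    x, y = torch.randn(6, 8), torch.randn(6, 1)
    loss = loss_fn(model(x), y) / accum_steps  # scale so grads average
    loss.backward()                            # gradients accumulate across steps
    if (step + 1) % accum_steps == 0:
        opt.step()
        opt.zero_grad()
```

A larger effective batch also stabilizes the gradient estimate, which may let you raise the learning rate above the "very very small" values that were needed at batch size 6.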
Hi, I met the same problem when training on KITTI: just as you described, it converges very fast and outputs constant images. Have you solved it? Thank you.
@janicepan @LCJHust I also met the same issue. Have you solved the problems? I trained the model on NYUv2 dataset and the output of the model is constant. So weird!
This is the picture:
The predicted depth values:
Hi everyone, I have updated the implementation of DORN and solved the constant-output problem.
Hey guys, I've run into a difficult problem: I set the '--c' parameter to the path of 'resnet101_v1c.pth' and attempt to run train.py, but encounter the following error: ruamel.yaml.reader.ReaderError: unacceptable character #x0080: invalid start byte. Could you help me with this? Is this step correct? Thanks a lot.
[Automatic vacation reply from QQ Mail: Hello, I have received your email and will reply as soon as possible.]
Has anyone dealt with the model only predicting 1s? I cannot train this model as it is written on the NYU dataset. The validation predictions that are generated while training just come out as constants.