Open DavidTu21 opened 8 months ago
Thanks for your interest in our work! This error seems strange and may not be related to the code itself. I have never encountered this problem, but I guess the dataloader error comes from either multi-processing/multi-workers or memory issues. I would suggest lowering patch-size from 32 or reducing the number of workers. Here is a related thread: https://github.com/pytorch/pytorch/issues/8976 P.S. I think 20000-ish steps should be sufficient for training. Have you checked the validation results to see if the network has converged?
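As a rough sketch of the suggestion above, the snippet below halves the patch size and drops the worker count to 0 in a config that has already been loaded into a plain dict. The key names (`patch`, `size`, `num_workers`) and the helper `soften_loader_settings` are assumptions made for illustration and may not match the actual layout of the YAML config. With PyTorch's `DataLoader`, `num_workers=0` loads data in the main process, which usually makes the underlying exception easier to read.

```python
import copy


def soften_loader_settings(cfg, min_patch_size=8):
    """Return a copy of cfg with a smaller patch size and no worker processes.

    The keys used here ("patch", "size", "num_workers") are hypothetical;
    adapt them to whatever the real config file uses.
    """
    new_cfg = copy.deepcopy(cfg)  # leave the caller's config untouched
    patch = new_cfg.setdefault("patch", {})
    # Halve the patch size, but never go below min_patch_size.
    patch["size"] = max(min_patch_size, patch.get("size", 32) // 2)
    # 0 workers means data is loaded in the main process.
    new_cfg["num_workers"] = 0
    return new_cfg


cfg = {"patch": {"size": 32}, "num_workers": 4}
debug_cfg = soften_loader_settings(cfg)
print(debug_cfg)
```

If training runs cleanly with these settings, the worker count can be raised again step by step to find where the error reappears.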
Hi Tiange,
Thank you very much for your reply and the input. I first changed patch-size in configs/occnerf/zju_mocap/387/occnerf.yaml from 32 to 16 and will see how it goes.
Additionally, I didn't see any loss or epoch output, so I have attached prog_019000.png. Do you think further training is needed? Thank you again for your time.
Kind regards, David
Hi David, the results look good to me! I think you can stop there and report results :)
Hi Tiange, thank you very much for your input! I will close this issue then :)
Sorry Tiange, just one last question: if I'd like to further test the code on custom data (for example, a 3D body scan with real occlusions, in a format similar to the ZJU mocap data), will the current code be able to handle that?
Yes, it can handle real-world occlusions in any other dataset. But remember to remove the masking schema that is specified in the config file and executed in the data loader. Since the binary human masks for real-world occlusions already show only the visible body parts, no extra masking is needed.
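To illustrate the point about masking, here is a minimal sketch, not the repository's actual data loader. It assumes the simulated case works by hiding body pixels under a synthetic occluder; the function `training_mask` and its arguments are hypothetical names. Passing no occluder corresponds to real-world data, where the human mask is used unchanged.

```python
def training_mask(human_mask, simulated_occlusion=None):
    """Return the mask of body pixels that remain visible.

    human_mask: rows of 0/1 values, 1 = visible body pixel.
    simulated_occlusion: optional rows of 0/1 values, 1 = pixel hidden by a
        synthetic occluder. Pass None for data with real occlusions.
    Both names are illustrative assumptions, not the project's API.
    """
    if simulated_occlusion is None:
        # Real occlusions: the mask already marks only visible body parts.
        return [row[:] for row in human_mask]
    # Simulated occlusions: hide body pixels covered by the occluder.
    return [
        [m & (1 - o) for m, o in zip(mask_row, occ_row)]
        for mask_row, occ_row in zip(human_mask, simulated_occlusion)
    ]


human_mask = [[1, 1], [0, 1]]
occluder = [[0, 1], [0, 0]]
print(training_mask(human_mask, occluder))  # simulated occlusion applied
print(training_mask(human_mask))            # real occlusion: mask unchanged
```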
Dear authors,
Thank you for your amazing work. I just ran into a sudden issue while training on the ZJU mocap 387 example and would like to ask for some advice if possible. Specifically, training hit the following error during epoch 36, and when I tried to resume training the same error occurred. My current output folder contains logs.txt, neural_points_019500.jpg, prog_019000.jpg, neural_points_019000.jpg, prog_018500.jpg, etc. Thank you in advance for your help.
I am running on Ubuntu 18.04 with an RTX 3090.