Closed: matteobarato closed this issue 1 year ago
These are some example plots obtained with the `visualize_train` function on the custom dataset.
Hi @matteobarato, did you solve this issue? I have the same problem: after some epochs I start getting NaN values ("vote_loss: nan seg_loss: nan loss: nan"), and the same happens when running on my own dataset. I tested the first weights and got a box, but not on my object; with the other weights I got no detections at all.
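For the NaN losses mentioned above, a small guard can help find the first iteration where a loss term goes non-finite, so the offending batch can be inspected. This is only a minimal sketch: the helper name `non_finite_terms` and the dict layout are hypothetical and not part of PVNet, and the values are assumed to be plain numbers (or anything convertible with `float()`).

```python
import math


def non_finite_terms(loss_stats):
    """Return the names of loss terms whose value is NaN or infinite."""
    return [
        name
        for name, value in loss_stats.items()
        if not math.isfinite(float(value))
    ]


# Hypothetical values mirroring the log line quoted above.
stats = {
    "vote_loss": float("nan"),
    "seg_loss": float("nan"),
    "loss": float("nan"),
}

bad = non_finite_terms(stats)
if bad:
    # In a training loop this would be the place to stop and dump the batch.
    print("non-finite loss terms:", ", ".join(bad))
```

Calling such a check once per iteration, before the optimizer step, would show whether the NaN first appears in one specific term or in all of them at once.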
@matteobarato how did you solve this issue?
Problem Description
I am encountering an issue while training PVNet on a custom dataset. I created the dataset with BlenderProc and adapted it to the required format. However, after approximately 20-30 epochs of training, the Average Precision (AP) score reaches 1.0 while all other metrics remain stuck at 0.0. I have tested this with both my own custom dataset and the custom dataset provided in the Readme, and the issue persists in both cases.
I have attempted to address this problem by training for more epochs (e.g., 350), adjusting the learning rate, and training with and without data augmentation, but the metrics remain unchanged. Strangely, when I train PVNet on the Linemod dataset, all the scores increase as expected after 20-30 epochs and do not stay at zero.
Furthermore, when visualizing the bounding boxes, I can see that the correct bounding boxes are drawn for my custom dataset and the dataset provided in the Readme. Similar visualizations are observed when I run `visualize_train`. I suspect that there might be a bug in the training or evaluation code for custom datasets, or potentially in the dataset preprocessing step using `python run.py --type custom`.
Steps to Reproduce
To reproduce the issue:
Expected Behavior
I expect the metrics for all objects to increase and not remain stuck at 0.0, similar to the behavior observed when training on the Linemod dataset.
Please let me know if there is any additional information or logs required to diagnose and resolve this issue.
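One way to narrow down the "AP at 1.0, everything else at 0.0" symptom is to recompute a pose metric by hand on a few samples. The sketch below assumes the pose metrics include an ADD-style score: the mean distance between model points under the predicted and ground-truth poses, accepted when it is below a fraction of the model diameter. It is not taken from the PVNet code; all names (`add_error`, `add_accepted`, the toy cube) are hypothetical. It illustrates one possible cause, namely a diameter expressed in different units than the model points and translations, which would reject even a nearly perfect pose.

```python
import math


def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t to each 3D point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_error(points, R_pred, t_pred, R_gt, t_gt):
    """Mean Euclidean distance between model points under the two poses."""
    pred = transform(points, R_pred, t_pred)
    gt = transform(points, R_gt, t_gt)
    return sum(math.dist(a, b) for a, b in zip(pred, gt)) / len(points)


def add_accepted(points, R_pred, t_pred, R_gt, t_gt, diameter, fraction=0.1):
    """True if the ADD error is below `fraction` of the model diameter."""
    err = add_error(points, R_pred, t_pred, R_gt, t_gt)
    return err < fraction * diameter


# Hypothetical toy model: corners of a 100 mm cube, in millimetres.
cube_mm = [
    (x, y, z)
    for x in (0.0, 100.0)
    for y in (0.0, 100.0)
    for z in (0.0, 100.0)
]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_gt = [0.0, 0.0, 500.0]
t_pred = [2.0, 0.0, 500.0]  # prediction is 2 mm off along x

diameter_mm = math.dist(cube_mm[0], cube_mm[-1])  # cube diagonal in mm
diameter_m = diameter_mm / 1000.0                 # same diameter, but in metres

err = add_error(cube_mm, identity, t_pred, identity, t_gt)
ok_consistent = add_accepted(cube_mm, identity, t_pred, identity, t_gt, diameter_mm)
ok_mismatched = add_accepted(cube_mm, identity, t_pred, identity, t_gt, diameter_m)

print(f"ADD error: {err:.3f}")
print("accepted with diameter in mm:", ok_consistent)
print("accepted with diameter in m: ", ok_mismatched)
```

If a hand-computed score like this looks reasonable on the custom data while the reported metrics stay at 0.0, that would point at the evaluation inputs (model file, diameter, units) rather than at the training itself.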