EricLWJ closed this issue 1 year ago
Sorry, it turns out that I over-tuned some parameters, which caused the training to behave like this. I had set eval_frequency to a small number, which caused validation to run too frequently. To fix this, make sure eval_frequency in train_linemod_pvn3d is set to a reasonable value, not too small.
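For anyone who hits the same thing, here is a rough sketch of why a small eval_frequency can make training look stuck. It is not the PVN3D trainer code. It assumes validation is triggered every eval_frequency training iterations and that each trigger runs a full pass over the validation set. The function name simulate_training and all the numbers are made up for illustration.

```python
def simulate_training(n_train_batches, n_epochs, eval_frequency, n_val_batches):
    """Count the work done under an iteration-based validation schedule.

    Hypothetical model: one full validation pass is run every
    `eval_frequency` training iterations (an assumption, not the
    actual PVN3D implementation).
    """
    it = 0          # global training iteration counter
    val_passes = 0  # number of full validation passes triggered
    for _epoch in range(n_epochs):
        for _batch in range(n_train_batches):
            it += 1  # one training step
            if it % eval_frequency == 0:
                val_passes += 1  # full pass over the validation set
    # Total validation batches processed across all passes
    return it, val_passes, val_passes * n_val_batches


# Same single epoch, two different eval_frequency settings (illustrative numbers)
small = simulate_training(n_train_batches=1000, n_epochs=1,
                          eval_frequency=10, n_val_batches=500)
large = simulate_training(n_train_batches=1000, n_epochs=1,
                          eval_frequency=1000, n_val_batches=500)
print("eval_frequency=10:  ", small)
print("eval_frequency=1000:", large)
```

With these made-up numbers, the small setting spends far more batches on validation than on training within a single epoch, so the epoch counter barely moves even though the process is busy the whole time.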
Hello,
I am trying to run PVN3D on the linemod dataset and am currently stuck in a loop during training. Each training progress line you see in the image below is a completed run of training and validation for the 1st epoch. After it finishes training and validation, it prints the training progress and starts again without moving on to the next epoch. It stays stuck on the first epoch and the percentage does not increase. I would much appreciate any advice on how to remedy this. Thank you.
Details:
- Built according to the readme instructions.
- Training on the linemod ape dataset as per the instructions.
- Had to adjust some parameters, such as mini_batch_size=1, due to limited GPU memory.
- Adjusted n_total_epoch and eval_frequency to reduce the number of steps and see whether it would break out of the loop.
- Left the training running for 14 hours to see whether it would break out of the loop. It did not.