Open Sonymon opened 6 years ago
Can you try reducing the learning rate and resuming training from the latest checkpoint?
I reduced the learning rate from 0.001 to 0.0001. Now the NaN issue starts at step 44681.
The problem is not solved yet. Kindly help me out.
Maybe you can try increasing the batch size or reducing the learning rate.
I have this problem too. I'm training tiny-yolo v2 with 1 or 3 classes (VOC2007 classes "car", "bus", "motorbike"). It happens after a few epochs, so I think the data annotation is correct. A lower learning rate does not help, and I cannot use a larger batch size because my GPU only has 4 GB of memory.
Does anyone have a solution? Thanks. EDIT: per the comment in #793, using --trainer adam avoids this problem. It works for me! :)
Maybe you can change the optimizer (e.g. adam) and reduce the learning rate.
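For anyone who wants to try the optimizer and learning-rate suggestions together, here is a minimal sketch that assembles the `flow` command from the original post with `--trainer` and `--lr` added. The helper name `build_flow_command` and the default values are my own assumptions for illustration, not something taken from the thread, and it only builds and prints the command without running it. I have not verified that a checkpoint written with one optimizer resumes cleanly under another.

```python
def build_flow_command(load_step, trainer="adam", lr=1e-4, batch=None):
    """Assemble a darkflow `flow` training command as a list of arguments.

    Paths and the cfg name are copied from the original post; the trainer,
    lr and batch values are illustrative assumptions.
    """
    cmd = [
        "flow",
        "--model", "cfg/tiny-yolo-voc-1c.cfg",
        "--load", str(load_step),
        "--train",
        "--annotation", "train/annotations",
        "--dataset", "train/images",
        "--gpu", "0.6",
        "--epoch", "4000",
        "--trainer", trainer,   # optimizer suggested in the thread
        "--lr", str(lr),        # reduced learning rate
    ]
    if batch is not None:
        # Only add a batch size when one is given (GPU memory may limit it).
        cmd += ["--batch", str(batch)]
    return cmd


command = build_flow_command(43250)
print(" ".join(command))
```

Pass `batch=...` only if your GPU has room for it; otherwise the flag is left out and darkflow uses its default.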
I used the following parameters
I have gone through several related issues but have been unable to solve this. I also checked whether any annotation has xmin > xmax or ymin > ymax, but everything is fine. I am using 200 images x 8 classes.
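The annotation check described above can be scripted so it covers every box rather than a spot check. Below is a minimal sketch, assuming Pascal VOC style XML with `size` and `bndbox` elements. The function name `find_bad_boxes` and the sample annotation are hypothetical. Beyond the inverted-box test mentioned in the post, it also flags zero-area boxes and boxes outside the image; treating those as possible NaN causes is my assumption, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text):
    """Return (class_name, reason) pairs for suspicious boxes in one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    # Image size is optional here; bounds are only checked when it is present.
    width = float(size.findtext("width")) if size is not None else None
    height = float(size.findtext("height")) if size is not None else None

    problems = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin >= xmax or ymin >= ymax:
            problems.append((name, "inverted or zero-area box"))
        elif xmin < 0 or ymin < 0:
            problems.append((name, "negative coordinate"))
        elif width is not None and (xmax > width or ymax > height):
            problems.append((name, "box exceeds image size"))
    return problems


# Hypothetical annotation: one valid box, one zero-width box, one out of bounds.
SAMPLE = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object><name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>200</xmax><ymax>150</ymax></bndbox>
  </object>
  <object><name>bus</name>
    <bndbox><xmin>300</xmin><ymin>100</ymin><xmax>300</xmax><ymax>220</ymax></bndbox>
  </object>
  <object><name>motorbike</name>
    <bndbox><xmin>500</xmin><ymin>400</ymin><xmax>700</xmax><ymax>470</ymax></bndbox>
  </object>
</annotation>"""

print(find_bad_boxes(SAMPLE))
```

To run it over a real dataset, read each file under train/annotations and pass its contents to `find_bad_boxes`; an empty list for every file means none of these checks fired.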
I used the following command to resume training from step 43250:
flow --model cfg/tiny-yolo-voc-1c.cfg --load 43250 --train --annotation train/annotations --dataset train/images --gpu 0.6 --epoch 4000
System configuration: NVIDIA GeForce GTX 1060 (6 GB), 16 GB RAM, 256 GB SSD, Core i7 (7th gen)