Closed chengcchn closed 4 years ago
@hirotomusiker
Hi, thank you for using our repo. Yes, evaluation usually takes time because the evaluator performs forward passes on ALL the validation images (e.g. 5k images for COCO). Please set a larger eval_interval, more than 4000 for example. At iter=20 the model is not trained enough.
@hirotomusiker Thanks for your reply. But I am confused: the evaluation speed should have nothing to do with how well the model is trained. I have only made some modifications to support my own dataset. Moreover, when I run the script "python train.py --cfg config/yolov3_custom_eval.cfg --eval_interval 1 --checkpoint checkpoints/snapshot20.ckpt", the evaluation doesn't start.
OK, please check whether evaluation is actually running by inserting a print line in the inference loop: https://github.com/DeNA/PyTorch_YOLOv3/blob/master/utils/cocoapi_evaluator.py#L62-L91 By default, no logs appear during evaluation.
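For reference, a minimal sketch of what such a progress print could look like inside an inference loop. This is not the repo's actual code; names like `model` and `dataloader` are assumptions for illustration:

```python
import time

def evaluate(model, dataloader, log_every=10):
    """Run inference over all images, printing throughput as we go."""
    start = time.time()
    results = []
    for i, (img, img_id) in enumerate(dataloader):
        outputs = model(img)  # forward pass on one image/batch
        results.append((img_id, outputs))
        # progress print: confirms the loop is alive and shows img/s
        if (i + 1) % log_every == 0:
            rate = (i + 1) / (time.time() - start)
            print(f"evaluated {i + 1} images ({rate:.1f} img/s)")
    return results
```

Printing a running images-per-second figure, rather than a bare marker like "test1.", makes it easy to compare against the ~10 img/s reference speed mentioned below.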
Yes, I have tried this and the logs appear. I can see "test1." and "test2." printed continuously, but the speed is slow.
In my environment it's about 10 images / sec for image size = 608. How about you?
My server has been shut down, I will do the experiment again in the next two days and let you know the result. Thanks a lot!
I did the experiment again. I added "print(A)" at line 91 of "cocoapi_evaluator.py", and the output is as follows. It seems that the evaluation is running, but it takes about 8 secs/image for image size = 416. However, I also tested evaluation on COCO, and there the speed is fast, about 14 images/sec for image size = 608. So is the evaluation speed related to how well the model is trained?
Hi, @hirotomusiker. I redid the experiment and strictly checked the JSON dataset I had converted. I found that if I used scripts like "voc2coco.py" to generate the JSON dataset, an error would occur like in issue46 and issue50. I wrote the conversion script myself and the error disappeared. Finally, I ran the evaluation after 3000 iters and the evaluation speed was normal! So the evaluation speed is related to how well the model is trained. I don't know why.
Thanks for your repo again!
Thank you for reporting. Now I understand the issue. At the beginning of training, the output is random, so noisy boxes are detected everywhere in the image. The postprocess then tries to perform non-max suppression (NMS) on ALL of those boxes. That's why evaluation takes so long. So the only solution is not to evaluate the model at an early stage of training. Thank you!
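To illustrate why an untrained model slows evaluation down: the cost of a naive greedy NMS grows roughly with the square of the number of candidate boxes, so thousands of random boxes passing the confidence threshold make postprocessing crawl. A minimal pure-Python sketch (not the repo's implementation) of that greedy loop:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, thresh=0.45):
    """Greedy NMS: keep the highest-scoring box, drop overlapping ones, repeat.

    Each kept box is compared against every remaining candidate, so the
    work is O(n^2) in the worst case when n boxes survive the confidence
    threshold - which is exactly what happens with an untrained model.
    """
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep
```

With a trained model only a handful of confident boxes reach this loop; with random early-training outputs, thousands do, and the pairwise IoU comparisons dominate evaluation time.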
Thanks for sharing this repo! I tried to use it to train on my own dataset. After I solved the problems I met and started the training, I found the evaluation process is really slow. I set eval_interval to 20 to observe the evaluation, but it just got stuck there. Looking forward to your reply, thanks!