YunYang1994 / tensorflow-yolov3

🔥 TensorFlow Code for technical report: "YOLOv3: An Incremental Improvement"
https://yunyang1994.gitee.io/2018/12/28/YOLOv3-算法的一点理解/
MIT License
3.63k stars 1.36k forks

YoloV3 MNIST reader #221

Open Guitlle opened 5 years ago

Guitlle commented 5 years ago

Hi. First of all, thanks for making this and porting it to TensorFlow.

I have trained YOLOv3 to read handwritten digits. I built fake (synthetic) training data by placing digits at random positions on a blank image. I then added noisy images with ugly backgrounds, so that the model can be used to digitize forms filled in by hand.
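A minimal sketch of how such fake data could be generated, assuming the digits are available as a NumPy array of 28x28 grayscale patches. The function name `make_synthetic_sample`, the 416-pixel canvas, and the `(x_min, y_min, x_max, y_max, class_id)` box layout are illustrative assumptions, not taken from the notebook linked in this issue.

```python
import numpy as np

def make_synthetic_sample(digits, labels, canvas_size=416, n_digits=5, rng=None):
    """Paste digit patches at random positions on a blank canvas.

    digits: uint8 array of shape (N, H, W), e.g. MNIST images (assumed input).
    labels: integer array of shape (N,) with the class id of each patch.
    Returns the canvas and a list of (x_min, y_min, x_max, y_max, class_id).
    """
    if rng is None:
        rng = np.random.default_rng()
    canvas = np.zeros((canvas_size, canvas_size), dtype=np.uint8)
    boxes = []
    h, w = digits.shape[1:]
    for _ in range(n_digits):
        idx = int(rng.integers(len(digits)))
        # The upper bound is exclusive, so the patch always fits on the canvas.
        x = int(rng.integers(0, canvas_size - w + 1))
        y = int(rng.integers(0, canvas_size - h + 1))
        # np.maximum keeps earlier digits visible if two patches overlap.
        canvas[y:y + h, x:x + w] = np.maximum(canvas[y:y + h, x:x + w], digits[idx])
        boxes.append((x, y, x + w, y + h, int(labels[idx])))
    return canvas, boxes
```

Noisy backgrounds could then be added by starting from a textured or scanned image instead of the all-zero canvas. This sketch does not prevent digits from overlapping, which may or may not be desirable for form-like data.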

Check out the first results here (after training for 7 epochs):

https://github.com/Guitlle/tensorflow-yolov3/blob/mnist-reader/Test%20MNIST%20reading.ipynb

Do you have any ideas on how I might improve it? I notice that the confidence scores are very low, between 0.10 and 0.20, even though the predictions are already very good. I guess this is because the model takes many regions to be numbers when they don't contain any number.

Total loss is stuck at ~6 and GIoU is at ~0.9. I am not sure whether running the remaining epochs will help, as I don't see the loss going down.
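For reference, the GIoU figure quoted above presumably refers to a box-regression loss term built on generalized IoU, commonly defined as `1 - GIoU`. The sketch below computes GIoU for two axis-aligned boxes in `(x_min, y_min, x_max, y_max)` form. The function name is hypothetical, and how this repository scales or sums the term across boxes is not shown in this thread.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x_min, y_min, x_max, y_max).

    Assumes both boxes have positive area. The result lies in (-1, 1].
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)

    # Intersection is clamped at zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose_w = max(ax1, bx1) - min(ax0, bx0)
    enclose_h = max(ay1, by1) - min(ay0, by0)
    enclose = enclose_w * enclose_h

    # Penalize the empty part of the enclosing box.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Per-box loss under the common 1 - GIoU convention."""
    return 1.0 - giou(box_a, box_b)
```

Under this convention a perfect match gives a loss of 0 and non-overlapping boxes give a loss above 1, so a summed or scaled value near 0.9 cannot be read as a per-box quality without knowing the reduction used.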

Despite this, it has worked great without any changes and with fake data.

Perhaps I need noisy real data for it to improve.

Thanks, Guillermo.

YunYang1994 commented 5 years ago

Hi, thanks for your feedback. I strongly recommend training from scratch, since your dataset is totally different from COCO. As for the loss not going down even after 7 epochs, that could be explained by your learning rate schedule. The learning rate is expected to go up during the first 2 epochs and then gradually go down over the remaining epochs, so the total loss should start to go down once the learning rate becomes very low. Why didn't you wait for training to finish? And how many pictures did you feed to the neural network?
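A warmup-then-decay schedule of the kind described above can be sketched as follows. The linear warmup, the cosine shape of the decay, and the default values of `lr_init` and `lr_end` are assumptions made for illustration; the thread only says that the rate rises for the first 2 epochs and then gradually falls.

```python
import math

def learning_rate(step, warmup_steps, total_steps, lr_init=1e-3, lr_end=1e-6):
    """Learning rate at a given training step.

    Rises linearly from 0 to lr_init during warmup, then follows a
    half-cosine from lr_init down to lr_end (the cosine shape is an
    assumption, not confirmed by this thread).
    """
    if step < warmup_steps:
        return lr_init * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return lr_end + 0.5 * (lr_init - lr_end) * (1.0 + math.cos(math.pi * progress))


# Example: 30 epochs of 100 steps each, with a 2-epoch warmup (hypothetical numbers).
steps_per_epoch = 100
schedule = [
    learning_rate(s, 2 * steps_per_epoch, 30 * steps_per_epoch)
    for s in range(30 * steps_per_epoch + 1)
]
```

With a schedule like this, the rate at epoch 7 is still close to its peak, which is consistent with the suggestion that the loss may only settle later in training.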

Guitlle commented 5 years ago

Hi. I did train it from scratch, without the COCO weights. I am impatient, so I just wanted to test it with the intermediate checkpoints. Now I have started from scratch again, but with noisy images. I used 1000 fake data images before; now I am going with 500. It is taking 30 minutes per epoch now. It looks like it is evolving better with this data, and the test loss and training loss look much better now (almost the same magnitude).