themathgeek13 opened this issue 6 years ago
A couple of updates based on my own experiments (may be useful for future users):

1. With the pretrained weights and a batch size of 8 on an AWS p2.xlarge instance (NVIDIA K80 GPU), the losses started around 0.7 or 0.8 and dropped to around 0.4-0.5 after roughly 2500-3000 steps. These are rough figures but should give some idea of what to expect. The mAP reported by the eval.py script was around 0.25-0.3 at this stage. I expect the mAP to increase and the loss to fall to 0.1-0.2, as per the graph in the README, as the number of steps increases.
2. Without pretrained weights - I have not yet tried this; I will update as soon as I do.
I have also been trying to implement a binarized version of YOLO, but was unable to get the loss to converge below 0.7. It oscillates between 0.7 and 0.9, sometimes jumping above 1, and the mAP oscillates between 0.07 and 0.12. The same binarization worked for MNIST, CIFAR-10 and ImageNet. Suggestions for this are appreciated.
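For context, a typical BinaryConnect-style setup (weights binarized with `sign()` in the forward pass, straight-through estimator for the gradients) looks roughly like the sketch below. This is a simplified illustration, not my exact code; the layer names and details are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Binarize weights to {-1, +1} in the forward pass; straight-through gradient backward."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Straight-through estimator: pass the gradient where |w| <= 1, zero elsewhere.
        return grad_output * (w.abs() <= 1).float()

class BinaryConv2d(nn.Conv2d):
    """Drop-in Conv2d whose weights are binarized at forward time."""
    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)
        return F.conv2d(x, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Usage: replace ordinary conv layers in the detector with BinaryConv2d.
layer = BinaryConv2d(3, 32, kernel_size=3, padding=1)
out = layer(torch.randn(1, 3, 416, 416))
```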
Hi @themathgeek13. Thanks for your detailed experiments. I would like to share my experiments as well.
Using the newest code (commit 2ca525a) with the default parameters, training on 4 TITAN X GPUs, the mAP reaches >58% after ~10 epochs and the loss is ~0.2.
Anyway, if you want to reach >60% mAP, you should carefully adjust the lr and the loss weights (see the lambda values in yolo_loss.py). A rough sketch of how those weights enter the loss is below.
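A minimal sketch of how lambda-style weights typically combine the YOLO loss terms. The attribute names and default values here (lambda_xy, lambda_wh, lambda_conf, lambda_cls, 2.5/1.0) are assumptions based on common YOLOv3 ports; check yolo_loss.py for the actual names and values.

```python
import torch.nn as nn

class YOLOLossSketch(nn.Module):
    """Illustrative combination of YOLO loss terms with tunable lambda weights."""
    def __init__(self, lambda_xy=2.5, lambda_wh=2.5, lambda_conf=1.0, lambda_cls=1.0):
        super().__init__()
        self.lambda_xy = lambda_xy      # weight on the box-center (x, y) loss
        self.lambda_wh = lambda_wh      # weight on the box width/height loss
        self.lambda_conf = lambda_conf  # weight on the objectness loss
        self.lambda_cls = lambda_cls    # weight on the classification loss

    def combine(self, loss_x, loss_y, loss_w, loss_h, loss_conf, loss_cls):
        # The total loss is a weighted sum; shifting these weights changes how much
        # the optimizer prioritizes localization vs. confidence vs. class terms.
        return (self.lambda_xy * (loss_x + loss_y)
                + self.lambda_wh * (loss_w + loss_h)
                + self.lambda_conf * loss_conf
                + self.lambda_cls * loss_cls)
```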
BTW, I have added data augmentation and improved the loss layer in the newest code. Please update.
If you run into any issues, please let me know. Thanks.
Thanks for these updates, they will be very useful. Is the training that reached 58% mAP from scratch or with pretrained weights?
@themathgeek13 Hi, I used ImageNet pretrained weights in this case. I will try training from scratch and share the result with you, but as you know it will take a long time to train from scratch.
Oh that's all right, I just wanted to know because if I modify the network for binarization, I would need to train from scratch since the pretrained weights cannot be used directly. No problem, I will test this myself :+1:
Hi @BobLiu20, I also have a short question about your training. Did you freeze the base network weights (the pretrained feature extractor) during training, or did you train the whole network?
@dionysos4 The backbone weights are not frozen, but they use a different lr. You can see it in the params.py file; the sketch below shows the idea.
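In PyTorch terms this usually means optimizer parameter groups. A minimal sketch, assuming a `backbone` attribute and illustrative lr values (the real model and settings live in this repo's code and params.py):

```python
import torch.nn as nn
import torch.optim as optim

class ToyDetector(nn.Module):
    """Toy stand-in for the detector: a pretrained backbone plus detection head."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 32, 3)   # stand-in for the pretrained DarkNet-53
        self.head = nn.Conv2d(32, 255, 1)     # stand-in for the YOLO detection layers

net = ToyDetector()
backbone_ids = {id(p) for p in net.backbone.parameters()}
head_params = [p for p in net.parameters() if id(p) not in backbone_ids]

# Nothing is frozen; the backbone simply gets a smaller learning rate.
optimizer = optim.SGD(
    [
        {"params": net.backbone.parameters(), "lr": 1e-4},  # pretrained backbone: small lr
        {"params": head_params, "lr": 1e-3},                # new detection layers: larger lr
    ],
    momentum=0.9,
    weight_decay=5e-4,
)
```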
Hey @themathgeek13, any luck training the net from scratch?
Did you try training the darknet_53 model using Adam? Curiously, I notice that the convergence is a lot worse than with SGD... this is contrary to my expectations; I had hoped Adam would work better due to its adaptive learning rates. I am wondering what the cause might be...
(The orange/gray plots are for SGD; the green plot was obtained using Adam.)
Let me know if you observed the same phenomenon and already know what the reason might be. For reference, the two setups being compared are roughly as sketched below.
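A minimal sketch of the two optimizer configurations; the hyperparameter values here are illustrative assumptions, not the repo defaults.

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Conv2d(3, 32, 3)  # stand-in for the darknet_53 detector

# SGD with momentum (orange/gray runs): converged well in my experiments.
sgd = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=5e-4)

# Adam (green run): noticeably worse convergence for me, despite its
# adaptive per-parameter learning rates.
adam = optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-4)
```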
@bhargavajs07 Hi, did you train the net on the COCO dataset? Did you succeed in testing your trained model? My trained model produced no detections.
@themathgeek13 Hi, have you gotten a good result by training from scratch? I trained on COCO from scratch for 100 epochs, but can only get a very low mAP of about 0.10. I'm really confused.
@themathgeek13 With eval_coco.py it is even lower. The following are the results.
> 2. Without pretrained weights - have not yet tried this, will update as soon as I do this.
Hi, I have run into a problem recently. I used the code to train on the VOC dataset, but when I test the trained model on an image, the output is empty! There are no detected bboxes in the image; it only shows the original image. Do you know how to solve this problem? Thank you very much!
Hi @BobLiu20, thanks for your excellent code! Can you please provide more details about training? Can it be done without the pretrained weights? How long would it take (how many epochs, and with what batch size)? I am not able to get a graph like the one you have shown; the loss only oscillates around the initial value and does not decrease.