anvenkat09 opened 6 years ago
@anvenkat09 still got this problem?
Hey Experiencor,
I'm actually getting a similar problem, but with the recall. On certain images the recall is simply 0; this drags the average recall down until, after a while, it becomes 0 as well. I read somewhere that I should look into using clipnorm or clipvalue to keep the gradient from exploding, and run a grid search to find proper values for them.
What do you think? Am I doing something wrong?
Thanks :)
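For reference, clipnorm rescales a whole gradient tensor so its L2 norm stays under a threshold, while clipvalue clamps individual components. A minimal NumPy sketch of what clipnorm does (function name and threshold here are illustrative, not from the repo):

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale a gradient so its L2 norm never exceeds max_norm
    (this is what the Keras clipnorm option does per tensor)."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])            # L2 norm = 5.0
clipped = clip_by_norm(g, 1.0)      # rescaled to norm 1.0 -> [0.6, 0.8]
```

In Keras itself you would just pass the threshold to the optimizer, e.g. `Adam(..., clipnorm=1.0)` or `clipvalue=0.5`; the right threshold is dataset-dependent, hence the grid search.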
@experiencor No, I fixed it. I don't remember exactly how, but it's not an issue anymore; it had something to do with my pretrained weights.
@11mhg Recall approaching 0 is another problem. It normally happens during warm-up, but not actual training. Please refer to the readme for tips.
Hey @experiencor, I followed the readme and I'm still seeing this problem. I let the net do its warm-up, then did the actual training, and the recall never moved from 0.
@11mhg What dataset did you use? You could try the whole thing again, as you may have missed something in the first run, e.g. you need to load the warmed-up weights before starting the actual training.
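For anyone following along, the two-stage flow is driven by the training section of config.json; the key names below are from memory of that config format and may differ slightly in your version, so treat this as a sketch:

```json
"train": {
    "pretrained_weights": "warmup_weights.h5",
    "warmup_epochs": 3,
    "nb_epochs": 50,
    "saved_weights_name": "best_weights.h5"
}
```

The point is that the weights saved at the end of the warm-up run need to be fed back in as the pretrained weights of the second, actual training run.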
Hey! Thanks for the quick replies.
I'm using the COCO dataset with just a small set of labels chosen. When I warm it up, it starts at a decent recall and drops to zero. I then load the warmed-up weights and do the training proper, and the recall starts at zero and never goes up. I'm currently training again with some modified parameters, so I'll let you know how that goes, but if you happen to have insight on COCO, that would be great!
Thanks again!
@11mhg The warmup training looks right to me, but the actual training is odd. You may try to train the detector for just one class to see how it goes.
What I'm finding is that the recall converges quickly to zero during warm-up, but during the actual training it increases very slowly. For the one label "person", after 56 or so epochs I get an average recall of 0.0085. What average recall should I expect?
EDIT: For the record, I'm using the 2017 dataset and adapted the parse annotations to use the cocoapi.
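In case it's useful to others, the adaptation is mostly reshaping COCO's annotation JSON into the per-image dicts the VOC-style parser produces. A pure-Python sketch (the output field names are assumed from that parser, and COCO boxes are `[x_min, y_min, width, height]`):

```python
def coco_to_instances(coco, wanted_labels):
    """Convert a COCO-style annotation dict into per-image records of the
    form {'filename', 'width', 'height', 'object': [...]} (names assumed)."""
    cats = {c['id']: c['name'] for c in coco['categories']}
    imgs = {i['id']: i for i in coco['images']}
    out = {}
    for ann in coco['annotations']:
        name = cats[ann['category_id']]
        if name not in wanted_labels:
            continue  # keep only the chosen subset of labels
        img = imgs[ann['image_id']]
        rec = out.setdefault(img['id'], {
            'filename': img['file_name'],
            'width': img['width'],
            'height': img['height'],
            'object': [],
        })
        x, y, w, h = ann['bbox']  # COCO stores [x_min, y_min, width, height]
        rec['object'].append({
            'name': name,
            'xmin': int(x), 'ymin': int(y),
            'xmax': int(x + w), 'ymax': int(y + h),
        })
    return list(out.values())
```

Getting the `[x, y, w, h]` to `(xmin, ymin, xmax, ymax)` conversion wrong is a classic source of zero recall, so it's worth unit-testing this step in isolation.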
@11mhg Current recall (an estimate of mAP) should be more than 0.3 for good detections in my experience. I assume that you have carefully checked the labels of the images.
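To sanity-check that number: the recall printed during training is roughly the fraction of ground-truth boxes matched by some detection above an IoU threshold. A self-contained sketch (the `(xmin, ymin, xmax, ymax)` box format and 0.5 threshold are assumptions, not copied from the repo):

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def recall(preds, truths, thresh=0.5):
    """Fraction of ground-truth boxes matched by a prediction with IoU >= thresh."""
    if not truths:
        return 0.0
    hit = sum(any(iou(p, t) >= thresh for p in preds) for t in truths)
    return hit / len(truths)
```

So a recall of 0.0085 means essentially no ground-truth box is ever being matched, which points at labels, anchors, or the loaded weights rather than slow convergence.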
Yes, I've verified the labels and done a multitude of tests on the parse annotations to make sure it is correct. I will take a look at using perhaps an older COCO dataset to see what happens.
Okay, what I've found is that by simply starting from the darknet weights trained on COCO 2014, I managed to fine-tune on the COCO 2017 dataset and reach a recall of around 0.34!
@anvenkat09 Can you tell me what you did? I have the same problem with ResNet.
To update the NaN-related issues: what worked for me was adding images and annotations to "valid_image_folder" (I previously relied on having the training set split 80/20 as per the readme, but got NaN losses). I also raised the training nb_epochs from 1 to 10, and will likely need more.
Actually, it might have to do with the model anchors: generating new ones (other than those given in the README example) led to the NaN values.
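For context, the repo's gen_anchors.py clusters the (width, height) of your training boxes to pick anchors. A rough Euclidean k-means sketch of the idea (the real script uses a 1 − IoU distance, so this is illustrative only, and badly scaled anchors can indeed destabilize the loss):

```python
import numpy as np

def kmeans_anchors(wh, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchor shapes with plain k-means."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each box to its nearest center
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=-1)
        assign = d.argmin(axis=1)
        # move each center to the mean of its assigned boxes
        for i in range(k):
            if (assign == i).any():
                centers[i] = wh[assign == i].mean(axis=0)
    return centers
```

If new anchors trigger NaNs, it's worth checking that they are expressed in the same units (grid cells vs. pixels) the config expects.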
Hi Experiencor,
Wonderful implementation! I just had a couple of questions:
I have written my own feature extractor CNN, similar to ResNet, and have been using it in place of the Full-Yolo classifier in your step-by-step notebook. After about 140 epochs of training, the loss becomes NaN. I came up with a couple of options; could you please recommend what I should do?
Thanks so much!