Open marooncn opened 4 years ago
This issue is caused by a mistake in eval.py, line 212: `for i in range(valid_dataset.__num_class__()):`
The range should be the number of examples in valid_dataset, not the number of classes. Simply change it to: `for i in range(len(valid_dataset)):`
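To illustrate the off-by-design bug, here is a minimal sketch with a toy stand-in for the real dataset class (the `ValidDataset` wrapper below is hypothetical, only its `__len__`/`__num_class__` contract mirrors the repo's):

```python
class ValidDataset:
    """Toy stand-in for the real valid_dataset object."""
    def __init__(self, examples, num_classes):
        self.examples = examples
        self.num_classes = num_classes

    def __len__(self):
        return len(self.examples)      # number of validation examples

    def __num_class__(self):
        return self.num_classes       # number of object classes

valid_dataset = ValidDataset(examples=list(range(10)), num_classes=3)

# Buggy loop: only visits as many examples as there are classes (3 of 10).
visited_buggy = [i for i in range(valid_dataset.__num_class__())]

# Fixed loop: visits every example in the validation set.
visited_fixed = [i for i in range(len(valid_dataset))]
```

With 10 examples and 3 classes, the buggy loop silently evaluates only the first 3 examples.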
Hope it helps.
Hi, @Vandmoon Actually I use demo.py to test.
Yeah, I tried demo.py and got the same error as you. @marooncn
I think the error is caused by the model not being trained enough; I got only one box when I loaded the trained weights and ran inference with demo.py.
@marooncn, Did you find the solution?
I found that it happens with both VOC and COCO. As training goes on, all the confidence values shrink and eventually hit their minimum.
But for VOC, if you start training from the checkpoint given in the README, the values do not shrink... weird... I guess this is because of the loss function. Maybe multibox loss would be better?
I checked the FocalLoss function in use but did not find anything wrong. Maybe the bug is somewhere else.
One more thing: I found that cls_loss saturates early in training, so maybe that is an issue worth figuring out.
Still have the problem.
Deleting the line `classification = torch.clamp(classification, 1e-4, 1.0 - 1e-4)` in FocalLoss seems to solve the problem.
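For context on what that clamp does, here is a minimal binary focal loss sketch in the standard Lin et al. formulation (my own sketch, not the repo's exact code): the clamp bounds probabilities away from 0 and 1, which also caps the loss contribution of extremely confident mistakes.

```python
import torch

def focal_loss(classification, targets, alpha=0.25, gamma=2.0, clamp=True):
    """Binary focal loss sketch; `classification` holds sigmoid probabilities."""
    if clamp:
        # The line under discussion: bounds probs to [1e-4, 1 - 1e-4].
        classification = torch.clamp(classification, 1e-4, 1.0 - 1e-4)
    # alpha weighting and p_t per the focal loss paper
    alpha_factor = alpha * targets + (1.0 - alpha) * (1.0 - targets)
    p_t = targets * classification + (1.0 - targets) * (1.0 - classification)
    bce = -torch.log(p_t)
    return (alpha_factor * (1.0 - p_t) ** gamma * bce).mean()

probs = torch.tensor([1e-6, 0.9])   # first prediction: a very confident miss
targets = torch.tensor([1.0, 1.0])
loss_clamped = focal_loss(probs, targets, clamp=True)    # miss penalty is capped
loss_unclamped = focal_loss(probs, targets, clamp=False)  # miss penalized in full
```

So removing the clamp makes the loss (and gradient) larger for predictions pinned near the probability floor, which may be why it changed the training dynamics here.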
@wonderingboy I tried the same, but it didn't work for me.
Stuck in the same situation:
@wonderingboy how did that work for you? That line implements the label smoothing from the Bag of Tricks paper. I would rather not omit it.
@yrd241, the error seems to be in the classification loss... I have not seen a more twisted implementation of the Focal Loss. I suspect the error lies in how the best anchor for each predicted box is computed, specifically line 65 in losses.py, but I am not sure yet. And changing that results in an error.
Actually I do not see how this loss function resembles the Google counterpart at all: https://github.com/google/automl/blob/14548b7175e093c9f0586b372180c41ffc04fbc1/efficientdet/det_model_fn.py#L209
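For comparison, here is a compact logit-space focal loss in the spirit of the paper's formulation (my own sketch, not the automl code): it computes cross-entropy from raw logits with the numerically stable `binary_cross_entropy_with_logits`, so no probability clamp is needed at all.

```python
import torch
import torch.nn.functional as F

def sigmoid_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Focal loss from raw logits; stable without clamping probabilities."""
    probs = torch.sigmoid(logits)
    # per-element cross-entropy, computed in logit space for stability
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = targets * probs + (1.0 - targets) * (1.0 - probs)
    alpha_t = targets * alpha + (1.0 - targets) * (1.0 - alpha)
    # modulating factor (1 - p_t)^gamma down-weights easy examples
    return (alpha_t * (1.0 - p_t) ** gamma * ce).mean()

# A confident miss (very negative logit on a positive target) should cost
# far more than a confident hit.
loss_miss = sigmoid_focal_loss(torch.tensor([-5.0]), torch.tensor([1.0]))
loss_hit = sigmoid_focal_loss(torch.tensor([5.0]), torch.tensor([1.0]))
```

Working in logit space like this sidesteps the clamp-vs-no-clamp issue discussed above entirely.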
Hi, I tried using your code to train on my own dataset. Training runs normally and the loss drops to 2.1 after 100 epochs. But when I load the trained weights to run inference, I see this output:
When I print the max prediction score, it is very close to 0. Maybe the model has collapsed to a minimum where it predicts nothing regardless of the input. I then tried different optimizers and learning rates, with the same results. Any advice, please?
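A quick sanity check worth running on the raw model output (hypothetical tensor names; adapt to what demo.py gives you): if the per-anchor max score stays near the floor across many images, the classification head has collapsed to predicting background everywhere, and no box will survive the score threshold.

```python
import torch

def summarize_scores(classification):
    """Summarize per-anchor class scores.

    `classification` is assumed to be a (num_anchors, num_classes) tensor of
    sigmoid probabilities, as in typical RetinaNet/EfficientDet heads.
    """
    max_per_anchor = classification.max(dim=1).values
    return {
        "max": max_per_anchor.max().item(),
        "mean": max_per_anchor.mean().item(),
        "above_0.05": (max_per_anchor > 0.05).sum().item(),
    }

# Simulated collapsed output: every score pinned near zero.
collapsed = torch.full((100, 20), 1e-4)
stats = summarize_scores(collapsed)
# With all scores at ~1e-4, nothing clears even a 0.05 score threshold,
# which matches the symptom of getting no detections at inference time.
```

If `stats["max"]` is this low on real images too, the problem is in training (loss/labels), not in the demo's post-processing.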