Open thisisi3 opened 2 years ago
@thisisi3 In our experiments, the d2 (detectron2) version produced NaN relatively rarely. We will investigate the reason in the near future. If you are interested, you are welcome to join the investigation.
Sure, I am currently implementing YOLOF in PaddleDetection and will share my findings if I think they are valuable to this issue.
@hhaAndroid @thisisi3
Any update on this? I get NaN losses when training on WIDER FACE, even after reducing the learning rate. It is not easy to debug because it always happens in the middle of training.
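One way to narrow down where the NaN first appears is to check every loss term for finiteness at each iteration and stop immediately. The snippet below is a minimal, framework-agnostic sketch, not code from MMDetection or PaddleDetection. It assumes the loss terms have already been converted to plain Python floats, and the helper names `find_non_finite` and `check_losses` as well as the loss keys are hypothetical, chosen only for illustration.

```python
import math


def find_non_finite(losses):
    """Return the names of loss terms whose value is NaN or +/-inf.

    `losses` is assumed to map a loss name to a plain Python float.
    """
    return [name for name, value in losses.items() if not math.isfinite(value)]


def check_losses(iteration, losses):
    """Raise as soon as any loss term is non-finite, reporting the iteration."""
    bad = find_non_finite(losses)
    if bad:
        raise FloatingPointError(
            f"non-finite loss at iteration {iteration}: {', '.join(sorted(bad))}"
        )


# Illustrative usage with made-up values (the keys are assumptions):
if __name__ == "__main__":
    check_losses(100, {"loss_cls": 0.42, "loss_bbox": 0.87})  # passes silently
    try:
        check_losses(101, {"loss_cls": float("nan"), "loss_bbox": 0.91})
    except FloatingPointError as err:
        print(err)
```

Calling such a check right after the loss is computed, and saving the offending batch when it fires, would make it possible to replay the failing iteration instead of waiting for the NaN to reappear mid-training.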
In my experience, YOLOF shows an abnormal loss in about 1 out of every 6 training runs on VOC, so training should succeed most of the time. I have not trained YOLOF on WIDER FACE, so I can't give you any advice, sorry.
@thisisi3
Why is there an abnormal loss? Is this a bug?
Hi @twmht, to be honest I haven't found the reason why this happens. Further investigation is needed.
@thisisi3
The NaN problem is solved after replacing the YOLOF head with the GFL head. However, the accuracy is not good. In conclusion, there might be some bugs in the YOLOF head.
@twmht Interesting, I always thought the problem might be from the assigner, but it seems not.
Maybe it's caused by the assigner, since I use the ATSS assigner in the GFL head.
@hhaAndroid, @twmht Check out the following post. @yjh0410 has probably found the cause of the abnormal loss. https://github.com/thisisi3/Paddle-YOLOF/issues/1#issuecomment-1115545926
Dear Team, in the README of YOLOF's config, you mentioned "sometimes there are large loss fluctuations and NAN". I am wondering whether this also happens in the original implementation (the detectron2 version), and whether you have found the cause of this issue?