dagongji10 closed this issue 3 years ago
@dagongji10 Did you use the provided pretrained model? Have you tried using a larger batch size?
@Yuliang-Liu The loss is all NaN whether or not I use the pretrained model. My batch size is 1 because I have only 1 GPU.
Has your problem been solved? I have encountered a similar situation:
FloatingPointError: Loss became infinite or NaN at iteration=77389! loss_dict = {'rec_loss': 4.122969746589661, 'loss_fcos_cls': nan, 'loss_fcos_loc': 0.6235280930995941, 'loss_fcos_ctr': 0.6824557185173035, 'loss_fcos_bezier': 1.2363877594470978}
Is there any way to solve this problem? Any advice would be appreciated, thanks! @dagongji10 @Yuliang-Liu
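In the loss_dict above, only 'loss_fcos_cls' is NaN, so the detection classification branch is the first to diverge rather than the recognition branch ('rec_loss' is still finite). detectron2's trainer raises the FloatingPointError when the summed loss is non-finite; the sketch below (not detectron2's own code, and the helper name is hypothetical) shows how one might check each term separately so the offending branch is reported directly.

```python
import math
import torch

def check_loss_dict(loss_dict, iteration):
    """Raise as soon as any individual loss term becomes non-finite.

    Checking each term separately (instead of only the summed loss)
    shows which branch diverged first, e.g. 'loss_fcos_cls' above.
    """
    bad = {}
    for name, value in loss_dict.items():
        value = value.item() if torch.is_tensor(value) else float(value)
        if not math.isfinite(value):
            bad[name] = value
    if bad:
        raise FloatingPointError(
            f"Non-finite loss terms at iteration={iteration}: {bad}"
        )
```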
@Drangonliao123 I only got this problem when I used a mixed Chinese-English handwritten dataset. When I switched to a Chinese-only dataset, the problem disappeared. So I think my dataset quality was too low: the mixed Chinese-English handwriting is so messy that even I can't recognize some of it. Maybe you should check your data images and make sure you can read the text yourself.
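In the spirit of that advice, a quick scan of the transcriptions can show how much of a mixed-script dataset falls outside the character set the recognition head was trained with. This is only a rough sketch: the file name labels.txt (one transcription per line) and the charset below are placeholders, not part of ABCNet; substitute your own dictionary.

```python
import string

# Placeholder charset; replace with the character dictionary your
# recognition head actually uses (e.g. extend it with Chinese characters).
ALLOWED = set(string.digits + string.ascii_letters + string.punctuation + " ")

def scan_transcriptions(path="labels.txt"):
    """Report empty transcriptions and characters outside ALLOWED."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, 1):
            text = line.strip()
            unknown = sorted({ch for ch in text if ch not in ALLOWED})
            if not text:
                print(f"line {lineno}: empty transcription")
            elif unknown:
                print(f"line {lineno}: out-of-charset characters {unknown}")

if __name__ == "__main__":
    scan_transcriptions()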
Thank you! After I changed the learning rate to 0.0001, the error went away! But I think you have a point too. Thanks again!
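For reference, a minimal sketch of applying that change programmatically, assuming AdelaiDet's config helper (adet.config.get_cfg) and a BAText config file; the config path is only an example, use the one you actually train with. The same overrides can also be passed as key-value pairs at the end of the tools/train_net.py command line.

```python
from adet.config import get_cfg  # AdelaiDet's extended detectron2 config

cfg = get_cfg()
# Example config path; replace with the BAText config you are training with.
cfg.merge_from_file("configs/BAText/TotalText/attn_R_50.yaml")

cfg.SOLVER.IMS_PER_BATCH = 1   # single-GPU setup described in this thread
cfg.SOLVER.BASE_LR = 0.0001    # lowered learning rate that avoided the NaN here
cfg.freeze()
```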
I use ABCNet to train my own dataset. The dataset samples look like this: (screenshot). I used abcnet_custom_dataset_example_v2 for annotation and checked that the format is correct. My config is as follows: (screenshot). I also tried changing LOSS_WEIGHT to 0.5~0.8 and BASE_LR to 0.00001, but I still get the same problem: (screenshot). Can anyone help me with this or give me some advice?
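If it helps anyone debugging a similar setup, here is a rough structural check of a COCO-style ABCNet annotation file. The field names ('bezier_pts' with 16 coordinates, 'rec' for the encoded transcription, 'bbox') follow the custom-dataset example referenced above, but treat them as assumptions and adjust to your own JSON; the file name train.json is a placeholder.

```python
import json

def check_annotations(path="train.json"):  # placeholder path
    """Flag annotations that commonly break training: malformed bezier
    control points, empty transcriptions, or degenerate boxes."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    for ann in data.get("annotations", []):
        ann_id = ann.get("id")
        bezier = ann.get("bezier_pts", [])
        if len(bezier) != 16:
            print(f"ann {ann_id}: expected 16 bezier coordinates, got {len(bezier)}")
        if not ann.get("rec"):
            print(f"ann {ann_id}: empty 'rec' (transcription) field")
        x, y, w, h = ann.get("bbox", [0, 0, 0, 0])
        if w <= 0 or h <= 0:
            print(f"ann {ann_id}: degenerate bbox {[x, y, w, h]}")

if __name__ == "__main__":
    check_annotations()
```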