Open ustczhouyu opened 4 years ago
That's odd, but without more info I can't really provide more help. Have you resolved it?
Are you working with a single GPU? If so, did you decrease the batch size so that the batch fits into GPU memory? If yes to both:
Set SOLVER.BASE_LR in your model_config.yaml file about an order of magnitude lower (for example, set it to 0.0025).
A larger batch size gives you more stable gradient estimates, which is what lets you use a higher learning rate. When the batch size goes down, a good rule of thumb is that the learning rate should go down with it.
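To make the rule of thumb concrete, here is a minimal sketch of linear learning-rate scaling, which is one common way to apply it. The function name and the reference values (learning rate 0.02 at a total batch size of 16) are assumptions for illustration only; check the defaults in your own config before relying on them.

```python
def scale_learning_rate(reference_lr, reference_batch_size, batch_size):
    """Scale a learning rate linearly with the batch size.

    reference_lr and reference_batch_size describe a setup that is known
    to train stably; batch_size is what actually fits on your GPU.
    """
    if reference_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return reference_lr * batch_size / reference_batch_size


if __name__ == "__main__":
    # Assumed reference setup (hypothetical): LR 0.02 at batch size 16.
    # Training on a single GPU with 2 images per batch.
    new_lr = scale_learning_rate(0.02, 16, 2)
    print(f"SOLVER.BASE_LR {new_lr:.4f}")
```

Under those assumed reference values, a batch size of 2 works out to the 0.0025 suggested above. If the loss still goes to NaN, lowering the value further or warming up for longer may be worth trying.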
@HashiamKadhim @mrlooi Thank you very much. When I set the lr to 0.005, it works. But when I train the model on a dataset containing many small objects, I run into other difficulties. 1. The model detects two or more small objects that are close together in the horizontal or vertical direction as a single object. 2. The dataset has a complex background, and some background regions have a texture similar to the foreground, which leads to false positives. What should I do to solve these two problems? (For example, which parameters should be modified, or what kind of branch should be added?) Please help me.
Hi! I ran into the same issue. I did some tests, and the grad is always NaN.
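When narrowing this down, it can help to know the first iteration at which a value stops being finite. A minimal sketch in plain Python, assuming you have logged the per-iteration loss (or a gradient norm) as floats; `first_non_finite` and the sample values are hypothetical, not part of any library.

```python
import math


def first_non_finite(values):
    """Return the index of the first NaN or infinite entry, or None if all are finite."""
    for index, value in enumerate(values):
        if not math.isfinite(value):
            return index
    return None


if __name__ == "__main__":
    # Hypothetical logged losses: the third entry is already NaN.
    logged_losses = [2.31, 5.87, float("nan"), float("nan")]
    bad_iteration = first_non_finite(logged_losses)
    if bad_iteration is not None:
        print(f"first non-finite value at iteration {bad_iteration}")
```

If the very first iteration is already non-finite, the inputs or annotations are a more likely suspect than the learning rate.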
❓ Questions and Help
Help! When I train on my own dataset, the loss is NaN from the beginning. Can anybody tell me how to deal with it? Thanks a lot! @mrlooi