Open kprastey opened 3 years ago
I also hit this training crash. My PyTorch version is '1.8.0a0+1606899' and my CUDA version is 11.2. In this environment, I successfully trained the model with resnet50-fpn as the backbone. But when I use mobilev1-fpn as the backbone, it crashes!
@code-wangshuyi did you find out the reason for this crash?
I am not able to train on a custom annotated dataset. The losses suddenly explode after a few epochs and training crashes. Please look into this error and help resolve it.
Environment info: Training on Google Colab with:
The dataset contains 1500 annotated images (1800x1600 each).
Also, let me know if you need any other information. @dbolya
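To narrow down where the losses start to explode, it may help to log the per-iteration loss values and scan them for the first bad step. Below is a minimal sketch in plain Python. The function name `first_bad_step` and the `spike_factor` threshold are hypothetical and not part of this repository; it assumes you have collected the total loss per iteration as a list of floats.

```python
import math


def first_bad_step(losses, spike_factor=10.0):
    """Return the index of the first suspicious loss value, or None.

    A loss is treated as suspicious if it is NaN/inf, or if it exceeds
    spike_factor times the mean of all earlier losses.
    (Hypothetical helper; the threshold is an arbitrary assumption.)
    """
    running_sum = 0.0
    for i, loss in enumerate(losses):
        # NaN or +/-inf means training has already diverged at this step.
        if not math.isfinite(loss):
            return i
        # Compare against the mean of the previous i losses.
        if i > 0 and loss > spike_factor * (running_sum / i):
            return i
        running_sum += loss
    return None


if __name__ == "__main__":
    logged = [2.0, 1.5, 1.2, float("nan")]
    print(first_bad_step(logged))
```

Knowing the iteration at which the loss first spikes makes it easier to check which batch was being processed there, for example whether it contains an annotation with a degenerate box.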