facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
MIT License

lr=0.0025 for single-GPU training still causes NaN within a few iterations #1347

Open suxi1111 opened 1 year ago

suxi1111 commented 1 year ago

❓ Questions and Help

Help! When I train the network on COCO with the tutorial's lr=0.0025 for a single GPU, the loss still becomes NaN within a few iterations:

iteration : 1, losses : 39.65670394897461
iteration : 2, losses : 22.218917846679688
iteration : 3, losses : 68.60948944091797
iteration : 4, losses : 1266.863037109375
iteration : 5, losses : 332.03045654296875
iteration : 6, losses : 1436176.25
iteration : 7, losses : nan
iteration : 8, losses : nan
iteration : 9, losses : nan
iteration : 10, losses : nan
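
To pin down where the divergence starts, the log above can be checked with a small script. This is only a rough sketch: the helper names (`parse_losses`, `first_non_finite`, `first_blowup`) and the 10x threshold are made up for illustration and are not part of maskrcnn-benchmark. It assumes the log keeps the `iteration : N, losses : X` format shown above.

```python
import math
import re

# Loss log copied from the training output above.
LOG = (
    "iteration : 1, losses : 39.65670394897461 "
    "iteration : 2, losses : 22.218917846679688 "
    "iteration : 3, losses : 68.60948944091797 "
    "iteration : 4, losses : 1266.863037109375 "
    "iteration : 5, losses : 332.03045654296875 "
    "iteration : 6, losses : 1436176.25 "
    "iteration : 7, losses : nan "
    "iteration : 8, losses : nan "
    "iteration : 9, losses : nan "
    "iteration : 10, losses : nan"
)

# Matches "iteration : <int>, losses : <value>"; the value may be "nan".
PATTERN = re.compile(r"iteration\s*:\s*(\d+),\s*losses\s*:\s*(\S+)")


def parse_losses(log):
    """Return a list of (iteration, loss) pairs parsed from the log text."""
    return [(int(it), float(val)) for it, val in PATTERN.findall(log)]


def first_non_finite(pairs):
    """Return the first iteration whose loss is NaN or inf, or None."""
    for it, loss in pairs:
        if not math.isfinite(loss):
            return it
    return None


def first_blowup(pairs, factor=10.0):
    """Return the first iteration whose loss exceeds `factor` times the
    initial loss (hypothetical heuristic), or None if there is none."""
    if not pairs:
        return None
    baseline = pairs[0][1]
    for it, loss in pairs[1:]:
        if math.isfinite(loss) and loss > factor * baseline:
            return it
    return None


if __name__ == "__main__":
    pairs = parse_losses(LOG)
    print("first non-finite loss at iteration:", first_non_finite(pairs))
    print("first >10x blow-up at iteration:", first_blowup(pairs))
```

On this log the script reports the first NaN at iteration 7 and the first blow-up at iteration 4, so the loss is already diverging several steps before it turns into NaN.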