Open jim464190755 opened 12 months ago
When I train on the COCO dataset, the clipped_grad_norm value is NaN and total_loss barely decreases. What is wrong, or what might be causing this?