Open aaron-hgx opened 1 week ago
When I train with scale-aware pretraining, I get NaN values in the losses. What could be causing this? Without scale-aware pretraining, the losses seem fine.