Closed EcustBoy closed 1 year ago
I have the same problem when training the model on CARLA data. I chose multi-stage training, and the NaN error appears during the second stage (Prediction), at iteration N.
@BeautyCJ Hi, I added NaN-detection code to find the cause, and found that some network layers inevitably output INF values during the forward pass. In the end I switched to fp32 training for the planning stage and froze some modules pretrained in the prediction stage; with that, the whole training run completes.
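For anyone trying the same workaround, here is a rough sketch of what that could look like in PyTorch. It is not the code actually used in this thread: `ToyModel`, `HalfCast`, `add_nonfinite_hooks` and `freeze` are hypothetical names made up for illustration, and the fp16 cast only stands in for whatever layer overflows in the real network. The hooks record which submodule first produces a non-finite output, and `freeze` disables gradients for a module pretrained in an earlier stage.

```python
import torch
import torch.nn as nn


class HalfCast(nn.Module):
    """Hypothetical layer: casts to fp16, which overflows to inf above ~65504."""

    def forward(self, x):
        return x.to(torch.float16)


class ToyModel(nn.Module):
    """Hypothetical stand-in for a multi-stage model."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(1, 1)  # pretend this was trained in an earlier stage
        self.head = nn.Linear(1, 1)      # pretend this is trained in the current stage
        self.to_half = HalfCast()        # simulates a low-precision step

    def forward(self, x):
        return self.to_half(self.head(self.backbone(x)))


def add_nonfinite_hooks(model, report):
    """Register forward hooks that append a submodule's name to `report`
    whenever its output contains NaN or Inf."""
    handles = []
    for name, module in model.named_modules():
        if name == "":
            continue  # skip the root module; it would only repeat the last child

        def hook(mod, inputs, output, name=name):
            if isinstance(output, torch.Tensor):
                # Cast to fp32 before checking so the test itself is safe on CPU.
                if not torch.isfinite(output.float()).all().item():
                    report.append(name)

        handles.append(module.register_forward_hook(hook))
    return handles


def freeze(module):
    """Stop gradient updates for a pretrained module and put it in eval mode."""
    for p in module.parameters():
        p.requires_grad_(False)
    module.eval()


model = ToyModel()
with torch.no_grad():
    for p in model.parameters():
        p.fill_(1.0)  # deterministic weights: each Linear computes x + 1

report = []
handles = add_nonfinite_hooks(model, report)
model(torch.tensor([[70000.0]]))  # finite in fp32, overflows once cast to fp16
for h in handles:
    h.remove()

freeze(model.backbone)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print("non-finite outputs from:", report)
print("still trainable:", trainable)
```

In this toy run only the fp16 cast is flagged, while the fp32 layers before it stay finite, which matches the observation that staying in fp32 avoids the problem. In a real training script the optimizer would then be built only from the parameters that still have `requires_grad=True`.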
Thank you for the information. Hopefully it will shed some light on future attempts.
Dear author: