Closed zrthxn closed 7 months ago
The finetune CLI would load the model in float16 and then attempt to train. Doing it this was, the model would unable to converge. Here the model is loaded in float32 and then we allow autocasting where it is needed.
The finetune CLI would load the model in float16 and then attempt to train. Doing it this was, the model would unable to converge. Here the model is loaded in float32 and then we allow autocasting where it is needed.