After adopting half-precision computations for the model, the predictions exhibited substantial discrepancies when compared to those derived from single-precision (fp32) calculations.
Is the model also compatible with AMP (automatic mixed precision) training? I tried training it on about 10,000 samples, but the loss does not decrease and the model's predictions are always None.
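To make the precision gap concrete, here is a minimal, framework-independent sketch of how half-precision rounding can drift away from a higher-precision reference. It is not the model's code: `to_half`, `accumulate`, and `max_abs_diff` are hypothetical helpers introduced only for illustration, and Python's 64-bit float stands in for the higher-precision reference. Whether this kind of accumulated rounding error is what affects the model here is an assumption.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Uses the standard-library struct format "e" (binary16). Values beyond
    the half-precision range raise OverflowError instead of becoming inf.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def accumulate(values, half: bool) -> float:
    """Sum values, optionally rounding the running total to half precision
    after every addition (a rough stand-in for a half-precision reduction)."""
    total = 0.0
    for v in values:
        total = total + v
        if half:
            total = to_half(total)
    return total


def max_abs_diff(reference, candidate) -> float:
    """Largest element-wise absolute difference between two output lists."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


# Hypothetical data: many small contributions added into one running sum.
values = [0.4] * 4096
reference_sum = accumulate(values, half=False)
half_sum = accumulate(values, half=True)
print("reference sum:", reference_sum)
print("half-precision sum:", half_sum)
print("absolute gap:", max_abs_diff([reference_sum], [half_sum]))
```

Once the running total is large enough, the spacing between adjacent half-precision values exceeds the increment, so further additions are rounded away. Comparing per-layer outputs with a helper like `max_abs_diff` may help locate where the half-precision and fp32 results start to diverge.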
What is the feature?
After adopting half-precision computations for the model, the predictions exhibited substantial discrepancies when compared to those derived from single-precision (fp32) calculations.
Any other context?
No response