DRAGNLabs / 301r_retnet


Replace fp16 with valid precision option #57

Closed DrewGalbraith closed 3 months ago

DrewGalbraith commented 3 months ago

Before my single commit, training with the default precision setting raised the following error:

ValueError: Precision 'fp16' is invalid. Allowed precision values: ('transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', 64, 32, 16, '64', '32', '16', 'bf16')

My commit changed the default to bf16, which I can confirm works. For more info on why this is the best default, see the PyTorch Lightning docs on Mixed Precision Training (or ask Jay or me).
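
A minimal sketch of what the fix amounts to, assuming the precision value is passed straight to a Lightning `Trainer` (the exact config path in this repo may differ):

```python
# Sketch only: "fp16" is not an accepted precision string in Lightning 2.x,
# which is what triggered the ValueError above. "bf16" is in the allowed list
# and is the value this PR sets as the default.
import lightning.pytorch as pl

trainer = pl.Trainer(
    precision="bf16",  # bf16 mixed precision; "16-mixed" would be the fp16 equivalent
    max_epochs=1,
)
```

bf16 keeps the same exponent range as fp32, so it generally trains more stably than fp16 mixed precision on hardware that supports it, which is the main reason it makes a better default here.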