PonteIneptique closed this issue 1 year ago.
Maybe we can add a line in the training advice to encourage people to use mixed precision.
If the device is CUDA -> "You can optimize training by using precision 16"?
Do you mean to print a warning when training on GPU using 32-bit precision? I was just thinking about adding a line to the documentation.
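For the print-a-hint option, a minimal sketch of what it could look like (the `device`/`precision` parameter names are hypothetical, just mirroring the CLI flags; the logging hook is illustrative, not the actual trainer code):

```python
import logging

logger = logging.getLogger(__name__)

def maybe_suggest_mixed_precision(device: str, precision: str) -> None:
    """Print a hint when training on CUDA with full 32-bit precision."""
    # Hypothetical hook: call this once when training starts.
    if device.startswith("cuda") and precision == "32":
        logger.info("You can optimize training by using precision 16.")
```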
FYI, I'd probably add a sanity check here:
```
RuntimeError: expected scalar type BFloat16 but found Float
```
(Note that I just forgot to add `--device cuda`, and that's how I found out.)
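The sanity check could look something like this (a sketch, assuming the same hypothetical `device`/`precision` settings as above; the point is to fail early with a clear message instead of the cryptic RuntimeError):

```python
def check_precision(device: str, precision: str) -> None:
    """Fail early when reduced precision is requested off the GPU."""
    # Running 16/bf16 training without a CUDA device surfaces as
    # "RuntimeError: expected scalar type BFloat16 but found Float",
    # so reject the combination up front with an actionable message.
    if precision in ("16", "bf16") and not device.startswith("cuda"):
        raise ValueError(
            f"precision={precision} requires a CUDA device, "
            f"but got device={device}. Did you forget --device cuda?"
        )
```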