facebookresearch / dadaptation

D-Adaptation for SGD, Adam and AdaGrad
MIT License

Float 16? #6

Closed by TKassis 1 year ago

TKassis commented 1 year ago

Just to confirm, these optimizers don't support 16-bit precision training yet, correct?

adefazio commented 1 year ago

They don't have native support. I've used them within fairseq (which provides float16 support by wrapping the optimizer) in some of my experiments.
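A minimal sketch of the kind of setup this points to, using PyTorch's own mixed-precision utilities (torch.cuda.amp) rather than fairseq's wrapper; this pairing is an assumption, not something confirmed in the thread. The forward/backward pass runs in float16 while the D-Adaptation optimizer state stays in float32.

```python
# Sketch (assumed setup, not from the thread): AMP mixed precision around
# dadaptation.DAdaptAdam. The optimizer itself still steps in float32.
import torch
from dadaptation import DAdaptAdam

model = torch.nn.Linear(128, 10).cuda()
criterion = torch.nn.CrossEntropyLoss()
optimizer = DAdaptAdam(model.parameters(), lr=1.0)  # lr=1.0 is the usual D-Adaptation default
scaler = torch.cuda.amp.GradScaler()

for step in range(100):
    inputs = torch.randn(32, 128, device="cuda")
    targets = torch.randint(0, 10, (32,), device="cuda")

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():   # forward pass in float16
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()     # scale the loss to avoid gradient underflow
    scaler.step(optimizer)            # unscales gradients, then steps in float32
    scaler.update()
```

This mirrors the comment's point: float16 support comes from wrapping the training loop or optimizer externally, not from the optimizers themselves.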