Closed: bscout9956 closed this pull request 4 years ago
There we go 👍 The last commit didn't get merged because I submitted it after your edit and the branching happened.
I've added support for setting `use_amp` as an option in the training config file in my branch. If it is false, a dummy cast scope is used; if it is true, `cuda.amp.autocast` is used. Should I create a new pull request?
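The toggle could look something like the minimal sketch below. The `use_amp` config key comes from the comment above; the `contextlib.nullcontext` fallback and the helper name are assumptions for illustration, not the actual branch code.

```python
# Minimal sketch of a use_amp toggle (illustrative, not the branch code).
import contextlib
import torch

def make_autocast_scope(use_amp: bool):
    """Return cuda.amp.autocast when AMP is enabled, else a no-op scope."""
    if use_amp and torch.cuda.is_available():
        return torch.cuda.amp.autocast()
    # Dummy cast scope: does nothing, so the fp32 path is unchanged.
    return contextlib.nullcontext()

# Usage inside a training step:
# with make_autocast_scope(config["use_amp"]):
#     output = model(batch)
#     loss = criterion(output, target)
```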
Since BA's not very confident about the effects of AMP on training, and my implementation lacks any sort of toggles, I have decided to let DinJerr do the job instead. I'm closing this PR.
Also known as Automatic Mixed Precision for deep learning. It improves training performance on Ampere, Volta, and Turing GPUs.
Code based on DinJerr's fork.
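For context, a typical PyTorch AMP training step follows the standard `autocast` + `GradScaler` pattern shown below. This is the generic pattern from the PyTorch docs, not DinJerr's actual code; the model, optimizer, and loss names are placeholders.

```python
# Standard PyTorch AMP training step (generic illustration, not the
# fork's actual code). autocast runs the forward pass in mixed
# precision; GradScaler scales the loss to avoid fp16 underflow.
import torch

model = torch.nn.Linear(128, 10).cuda()          # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()

def train_step(inputs, targets):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        outputs = model(inputs)
        loss = criterion(outputs, targets)
    scaler.scale(loss).backward()   # backward on the scaled loss
    scaler.step(optimizer)          # unscales grads, then steps
    scaler.update()                 # adjusts the scale for the next step
    return loss.item()
```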