What changed to enable Torch's built-in Nvidia AMP mixed precision training with AllenNLP (>v1.1.0rc2) for a significant speedup:
Enable AMP in the config by setting `trainer: { use_amp: true, ... }`.
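For reference, a minimal sketch of the relevant config block; the surrounding trainer fields here are placeholders, not the project's defaults:

```jsonnet
{
  trainer: {
    use_amp: true,      // enables torch.cuda.amp autocast + GradScaler in AllenNLP >= 1.1.0rc2
    optimizer: {
      type: "adamw",    // placeholder; keep whatever optimizer the task config already uses
      lr: 1e-3,         // placeholder
    },
    num_epochs: 50,     // placeholder
  },
}
```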
Fixed `RuntimeError: value cannot be converted to type Half without overflow: <value>` errors by adding a `.float()` call on tensors in reduction operations. Reductions are sensitive to overflow when using FP16, so all reductions should be performed in FP32 to guarantee a valid result. Autograd reverses this cast in the backward pass, so the model still runs in half precision.
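For illustration only (the tensor name below is made up; the actual patch touches the reduction call sites on the ace-event path), the pattern is an explicit upcast right before the reduction:

```python
import torch

# Made-up stand-in for an FP16 activation produced under torch.cuda.amp.autocast.
span_scores = torch.randn(8, 128).half()

# FP16 tops out around 65504, so summing many values (or filling with large
# sentinel constants) can overflow and trigger
# "RuntimeError: value cannot be converted to type Half without overflow".
# Upcasting first performs the reduction in FP32; autograd records the cast,
# and the rest of the forward pass stays in half precision.
total = span_scores.float().sum(dim=-1)
```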
I have only patched the calls used when training ace-event with default settings; for other tasks where other reduction functions are called, `.float()` should also be added to the relevant tensors.
I have not yet benchmarked the speedup from AMP on training, e.g. on ace-event.
In any case, this PR is tested and working for training events.