Is your proposal related to a problem?
REINFORCE works best with large batch sizes; however, if you train on a small GPU or a Google Colab one, CUDA memory is the limiting factor.
Describe the solution you'd like to have implemented
Support training / eval in fp16.
Additional context
A v0 implementation is as easy as calling .half() on the models. The trickier part is making sure there is no loss underflow in the optimizers: fp16 has a much smaller dynamic range than fp32, so small loss/gradient values round to zero. Recent PyTorch versions support automatic loss scaling for fp16 training (previously handled by NVIDIA's Apex amp library). We'd need to double-check that we use it and potentially bump the PyTorch version we pin.
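The underflow concern can be illustrated with a small numpy sketch (numpy stands in for fp16 tensor math here; in PyTorch the same scaling is what torch.cuda.amp.GradScaler automates):

```python
import numpy as np

# A small gradient value that underflows to zero in fp16:
# the smallest fp16 subnormal is 2**-24 ~= 5.96e-8.
grad = 1e-8
assert np.float16(grad) == 0.0  # underflow: the update is silently lost

# Loss scaling: multiply the loss (and hence its gradients) by a large
# factor before the fp16 cast, then divide the result back in fp32.
scale = 2.0 ** 16
scaled = np.float16(grad * scale)      # now within fp16's representable range
recovered = float(scaled) / scale      # unscale in fp32 before the optimizer step
assert scaled != 0.0
assert abs(recovered - grad) / grad < 1e-2  # value survives the round trip
```

This is exactly why calling .half() alone isn't enough: without scaling, small gradients vanish and training silently degrades.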