Closed zhao1iang closed 4 years ago
Thanks for your attention! I will explore this possibility in the next few days and let you know. I will release it with either an --fp16 option or a new branch.
Hi,
The half-precision support with the apex library is now available for preview. You can switch to it with
git checkout fp16
and adding the --fp16 argument to any of the bash commands will run them with apex, e.g.,
bash run/vqa_finetune.bash 0 vqa_lxr955_tiny --tiny --fp16
I would really appreciate it if you could help test it!
Comments:
Fine-tuning results with fp16 have been verified: VQA: 70.12, GQA: 59.81, NLVR2: 74.26. Pre-training with fp16 has not been verified.
Performance: the speed of VQA/GQA/pre-training is unchanged, but memory usage is reduced. NLVR2 gets a 2x speed gain.
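For anyone curious what the --fp16 path usually involves under the hood, here is a minimal sketch of how apex mixed precision is typically enabled in a PyTorch training loop. The tiny linear model and random tensors below are placeholders, not LXMERT's actual code, and the fp16 branch may wire things up differently.

import torch
from apex import amp  # NVIDIA apex must be installed separately

# Placeholder model and data; the real fine-tuning scripts build LXMERT instead.
model = torch.nn.Linear(768, 2).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# opt_level "O1" keeps fp32 master weights and casts whitelisted ops to fp16.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

inputs = torch.randn(8, 768).cuda()
labels = torch.randint(0, 2, (8,)).cuda()

optimizer.zero_grad()
loss = torch.nn.functional.cross_entropy(model(inputs), labels)

# Dynamic loss scaling guards fp16 gradients against underflow.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()

The memory savings reported above come from activations and most weights being stored in fp16; whether it also speeds training up depends on the GPU's Tensor Core support and on how much of the workload is bound by data loading rather than matrix multiplies.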
Thanks for your fantastic work. The code is clear and the operation manual is detailed. There is still one thing I want to know: do you plan to make LXMERT support fp16? I am reproducing your work on a 2080 Ti (or other Turing-architecture GPU), for which fp16 is suggested to speed up training. Unfortunately, I find that the current version does not support fp16. I would appreciate it if you (or anyone else) could provide a version supporting fp16.