nyu-mll / jiant

jiant is an NLP toolkit
https://jiant.info
MIT License
1.64k stars, 296 forks

How to use multiple GPUs to fine-tune on large datasets like ReCoRD? #1338

Open runzeer opened 3 years ago

zphang commented 3 years ago

The main runscript should automatically detect the number of GPUs available and run with DataParallel. Are you encountering any issues doing so?
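For readers unfamiliar with the pattern described above, here is a minimal sketch of GPU auto-detection followed by a DataParallel wrap in plain PyTorch. It is not jiant's actual runscript code; the helper name `wrap_for_available_gpus` and the toy `nn.Linear` model are assumptions made for illustration.

```python
import torch
import torch.nn as nn


def wrap_for_available_gpus(model: nn.Module) -> nn.Module:
    """Hypothetical helper: place the model on GPU(s) when any are visible."""
    n_gpu = torch.cuda.device_count()
    if n_gpu == 0:
        # No GPU visible: leave the model on the CPU.
        return model
    model = model.to("cuda")
    if n_gpu > 1:
        # DataParallel splits each input batch across all visible GPUs
        # and gathers the outputs back on the first device.
        model = nn.DataParallel(model)
    return model


# Toy model standing in for a real task model.
model = wrap_for_available_gpus(nn.Linear(16, 2))
```

To restrict which GPUs are detected, set `CUDA_VISIBLE_DEVICES` before launching the process.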

runzeer commented 3 years ago

When using DataParallel, memory usage is not balanced across the GPUs. If convenient, could you switch to DistributedDataParallel (DDP)?

zphang commented 2 years ago

Yep, we are aware that alternative multi-GPU methods like DDP have some advantages over DataParallel, but switching to DDP will introduce non-trivial complexity, and we do not currently have plans to incorporate it, although we are not ruling it out for the future.
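To give a sense of the extra moving parts DDP involves, here is a minimal sketch in plain PyTorch, not jiant code. It runs one process per GPU, and each process initializes a process group, wraps its own model replica, and reads a disjoint shard of the data through a `DistributedSampler`. The function name `train_one_epoch`, the toy model, the random data, and the default address and port are all assumptions for illustration.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def train_one_epoch(rank: int, world_size: int) -> float:
    """Hypothetical per-process entry point; returns the last batch loss."""
    # Rendezvous settings (assumed defaults for a single machine).
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    use_cuda = torch.cuda.is_available()
    dist.init_process_group(
        backend="nccl" if use_cuda else "gloo", rank=rank, world_size=world_size
    )
    try:
        if use_cuda:
            torch.cuda.set_device(rank)
        device = torch.device(f"cuda:{rank}" if use_cuda else "cpu")

        # Each process holds one full replica; gradients are all-reduced.
        model = nn.Linear(16, 2).to(device)
        ddp_model = DDP(model, device_ids=[rank] if use_cuda else None)

        # Random toy data; the sampler gives each rank a disjoint shard.
        dataset = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
        sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
        loader = DataLoader(dataset, batch_size=8, sampler=sampler)

        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
        loss_fn = nn.CrossEntropyLoss()
        sampler.set_epoch(0)  # reshuffle consistently across ranks per epoch

        last_loss = 0.0
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = loss_fn(ddp_model(inputs), labels)
            loss.backward()
            optimizer.step()
            last_loss = loss.item()
        return last_loss
    finally:
        dist.destroy_process_group()
```

One way to launch this is `torch.multiprocessing.spawn(train_one_epoch, args=(world_size,), nprocs=world_size)`, which passes the rank as the first argument. Because every process keeps its own replica and there is no central gather step, per-GPU memory tends to be more even than with DataParallel, at the cost of the setup shown here.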