google-research / electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

How can I make ELECTRA pretraining/data loading use only one GPU? #86

Closed · richarddwang closed this 3 years ago

richarddwang commented 3 years ago

I have a 4-GPU server, and I found that even when I just load data using the input_func from get_input_fn, it uses all 4 of my GPUs, allocating 305 MiB of memory on each.

anshulsamar commented 3 years ago

@richarddwang Perhaps try setting CUDA_VISIBLE_DEVICES? Or do you still have an issue...
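
For example (a minimal sketch; the device index 3 and the tiny check below are just placeholders): the variable has to be set before TensorFlow initializes CUDA, so either export it in the shell before launching (e.g. `CUDA_VISIBLE_DEVICES=3 python3 run_pretraining.py ...`) or set it at the very top of the script:

```python
import os

# Must happen before TensorFlow initializes CUDA (i.e., before the first
# Session or GPU op is created); otherwise all four GPUs are claimed anyway.
os.environ["CUDA_VISIBLE_DEVICES"] = "3"  # example: expose only GPU 3

import tensorflow.compat.v1 as tf  # the TF1-style API this repo uses

# With one visible device, TensorFlow addresses it as GPU 0 in-process.
print(tf.test.gpu_device_name())  # expected: /device:GPU:0
```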

richarddwang commented 3 years ago

Thank you, @anshulsamar! I set os.environ["CUDA_VISIBLE_DEVICES"] = "3" and it uses just cuda:3.

May I ask one more question? It now takes only 305 MiB on one GPU (a GV100, about as fast as a V100), but training is slow (1.1% done after 6 hours) with the default config. Any suggestions?
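
One thing worth checking (a sketch, not a confirmed diagnosis): 305 MiB is far below what a pretraining job normally allocates, so the model ops may be running on the CPU. TF1's device-placement logging can show where ops actually land:

```python
import tensorflow.compat.v1 as tf

# Log the device each op is assigned to when the session runs.
config = tf.ConfigProto(log_device_placement=True)

with tf.Session(config=config) as sess:
    a = tf.random_uniform([1024, 1024])
    b = tf.matmul(a, a)  # should be logged on /device:GPU:0 if the GPU is used
    sess.run(b)
```

If the heavy ops (matmuls etc.) log as /device:CPU:0, the slowdown would be CPU execution rather than the data pipeline.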

anshulsamar commented 3 years ago

No problem, not sure about that one...