Open phosseini opened 5 years ago
The code in this repo was not written to support multi-GPU training (mainly because I only have the one). But the code this is based on does support multiple GPUs, so you should be able to get it working with only a few changes.
I have two GPUs (2 x NVIDIA Tesla V100) and I'm running the code in `run_model.ipynb` on Google Cloud. I get a CUDA out-of-memory exception when I run with a sequence length longer than 128 at larger batch sizes. Do I need to make any changes to the code to make it run on multiple GPUs? Given the number of GPUs I have and their memory, I don't think I should be getting the out-of-memory error (please correct me if I'm wrong).
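For reference, the usual "few changes" for data-parallel training in PyTorch are a sketch like the one below: wrap the model in `torch.nn.DataParallel` so each forward pass splits the batch across visible GPUs. The tiny `nn.Linear` model here is a hypothetical stand-in for whatever model `run_model.ipynb` builds, and the batch/sequence sizes are made up for illustration. Note that `DataParallel` splits the *batch* dimension only, so it reduces per-GPU activation memory at a given batch size, but a single long sequence still has to fit on one card.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the model constructed in run_model.ipynb.
model = nn.Linear(128, 2)

# If more than one GPU is visible, DataParallel replicates the model on
# each device and scatters the batch across them (batch // num_gpus each).
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
if torch.cuda.is_available():
    model = model.cuda()

# Illustrative batch: 32 examples, feature dim 128.
batch = torch.randn(32, 128)
if torch.cuda.is_available():
    batch = batch.cuda()

out = model(batch)
print(out.shape)  # output keeps the full batch dimension: (32, 2)
```

If the out-of-memory error persists even with the batch split across both V100s, gradient accumulation (running several smaller forward/backward passes before each optimizer step) is the usual fallback, since attention memory grows roughly quadratically with sequence length regardless of GPU count.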