Sorry, my bad — it does work in multi-GPU settings.
(I had modified the dataloader for training on a custom dataset, which introduced memory bottlenecks that stalled GPU memory I/O.)
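For anyone who hits the same symptom: before suspecting the training code, it is worth confirming that the export actually reaches the process. A minimal sketch — `train.py` here is a placeholder, not this repo's actual entry point:

```shell
# Hedged sketch: verify the export itself before launching training.
export CUDA_VISIBLE_DEVICES=0,1,2,3
# Count the comma-separated device ids the training process will inherit.
NUM_GPUS=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
echo "visible GPUs: $NUM_GPUS"
# python train.py ...   # placeholder: launch in the same shell so the export is inherited
```

If the count is right but only one GPU shows load, the bottleneck is usually upstream of the model — as it was here, in my dataloader.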
Hi,
Thank you for your dedicated work.
I am trying to train the transformer on multiple GPUs (8× RTX 3090) following the README instruction: "support multiple GPUs export CUDA_VISIBLE_DEVICES=0,1,2,3".
However, the code only utilizes one GPU, no matter how many devices I expose via export CUDA_VISIBLE_DEVICES. It looks as though the current code does not support multi-GPU training. Could you confirm whether that is the case? If so, could you provide code or guidance for enabling multi-GPU training?
Here is my current bash command:
Thank you!
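As a sanity check on my side, I parse the variable the same way a launched process would see it (`visible_gpu_ids` is my own helper, not part of this repo; in a real PyTorch session, `torch.cuda.device_count()` should report the same number):

```python
import os

def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of device-id strings."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [d.strip() for d in raw.split(",") if d.strip()]

# The README's suggested export should expose four devices:
print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0,1,2,3"}))  # -> ['0', '1', '2', '3']
```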