I believe the script already supports DistributedDataParallel, and we used 8 A100 GPUs for training, as mentioned in the paper. Have you set CUDA_VISIBLE_DEVICES and --nproc_per_node to the correct GPU environment in your training script?
See the following script for an example:
CUDA_VISIBLE_DEVICES=4,5,6,7 torchrun --master_port 10000 --nproc_per_node 4 train_tiktok.py \
This will use GPUs 4, 5, 6, and 7 and run 4 processes (--nproc_per_node 4) for training.
In a similar way, if you'd like to use 8 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --master_port 10000 --nproc_per_node 8 train_tiktok.py \
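For clarity on how the two settings interact: torchrun sets a LOCAL_RANK environment variable (0 to nproc_per_node-1) for each process it spawns, and PyTorch only sees the GPUs listed in CUDA_VISIBLE_DEVICES, renumbered from 0. The sketch below (the helper `gpu_for_process` is hypothetical, just for illustration, and not part of this repo) shows which physical GPU each process ends up on:

```python
import os

def gpu_for_process(cuda_visible_devices: str, local_rank: int) -> int:
    """Map a torchrun process to the physical GPU it will use.

    torchrun sets LOCAL_RANK for each spawned process; PyTorch sees
    only the GPUs in CUDA_VISIBLE_DEVICES, renumbered from 0. So the
    physical GPU is the local_rank-th entry of that list.
    """
    visible = [int(g) for g in cuda_visible_devices.split(",")]
    return visible[local_rank]

# With CUDA_VISIBLE_DEVICES=4,5,6,7 and --nproc_per_node 4,
# the 4 processes land on physical GPUs 4, 5, 6, 7:
print([gpu_for_process("4,5,6,7", r) for r in range(4)])  # [4, 5, 6, 7]
```

So the number of entries in CUDA_VISIBLE_DEVICES should match --nproc_per_node, one process per GPU.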