Traceback (most recent call last):
  File "/home/raid/dtishencko/git/goblin/train/StyleTTS2/train_finetune_accelerate.py", line 284, in main
    ppgs, s2s_pred, s2s_attn = model.text_aligner(mels, mask, texts)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 171, in forward
    raise RuntimeError("module must have its parameters and buffers "
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cuda:2
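For context, nn.DataParallel requires every parameter and buffer of the wrapped module to already live on device_ids[0] when forward runs. A minimal sketch of a check (the helper name check_devices is mine, not from the StyleTTS2 script):

```python
import torch
import torch.nn as nn

def check_devices(module: nn.Module, expected: str) -> bool:
    """Return True if all parameters and buffers sit on `expected`."""
    devices = {p.device for p in module.parameters()}
    devices |= {b.device for b in module.buffers()}
    return devices <= {torch.device(expected)}

model = nn.Linear(4, 4)             # freshly built modules start on the CPU
print(check_devices(model, "cpu"))  # True on a CPU-only machine
```

Running a check like this on model.text_aligner right before the call on line 284 would show which submodule ended up on cuda:2.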
CUDA
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
I get the error above when I try to run the finetuning script on multiple GPUs. I have tried various combinations of the batch_size hyperparameter and the number of GPUs, but the error is always the same.
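One thing worth checking (a hedged sketch, not taken from the StyleTTS2 code): the module must be moved onto the first visible device before any nn.DataParallel wrapping, so that the wrapper's device_ids[0] matches where the weights actually live. The variable names below are stand-ins:

```python
import torch
import torch.nn as nn

# Stand-in for model.text_aligner; the real module comes from StyleTTS2.
text_aligner = nn.Linear(8, 8)

# Pick the device DataParallel will treat as device_ids[0].
primary = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
text_aligner = text_aligner.to(primary)  # weights now on the primary device

if torch.cuda.device_count() > 1:
    # Replicas are scattered from device_ids[0]; parameters must already be there.
    text_aligner = nn.DataParallel(text_aligner)
```

Alternatively, restricting visibility with something like CUDA_VISIBLE_DEVICES=0,1 before launching keeps device numbering starting at cuda:0 in every process, which avoids a module landing on cuda:2 while DataParallel expects cuda:0.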