facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
MIT License

Wrong device in average_metrics function #34

Open AndreyBocharnikov opened 1 year ago

AndreyBocharnikov commented 1 year ago

🐛 Bug Report

Since the average_metrics function is called from backward, which runs on every GPU, the device that is created here should match the current rank. Otherwise torch.distributed.all_reduce will hang forever.
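To make the suggestion concrete, here is a minimal sketch of a metrics-averaging function that builds its tensor on the device of the current rank. It is an illustration under assumptions, not the repository's code: `rank_device` is a hypothetical helper, and reading the rank from the `LOCAL_RANK` environment variable assumes a launcher such as torchrun that sets it.

```python
import os
import typing as tp


def rank_device(local_rank: int, cuda_available: bool) -> str:
    """Hypothetical helper: device string for the current process."""
    return f"cuda:{local_rank}" if cuda_available else "cpu"


def average_metrics(metrics: tp.Dict[str, float], count: float = 1.0) -> tp.Dict[str, float]:
    """Average non-empty `metrics` over all workers, weighted by `count`."""
    # Imported lazily so that rank_device stays usable without torch.
    import torch
    import torch.distributed as dist

    if not (dist.is_available() and dist.is_initialized()):
        return metrics
    # Assumption: the launcher exports LOCAL_RANK for each process.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    device = rank_device(local_rank, torch.cuda.is_available())
    keys, values = zip(*metrics.items())
    # The trailing 1 accumulates the total weight across workers.
    tensor = torch.tensor(list(values) + [1.0], device=device, dtype=torch.float32)
    tensor *= count
    dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
    averaged = (tensor[:-1] / tensor[-1]).cpu().tolist()
    return dict(zip(keys, averaged))
```

With an explicit `cuda:<rank>` device, each process contributes a tensor that lives on its own GPU, so the collective does not depend on which device happens to be the process default.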

adefossez commented 1 year ago

We use Dora for all of our experiments, which typically calls torch.cuda.set_device with the proper device at the beginning of training. That allows us to use 'cuda' everywhere afterwards without worrying about the rank of the GPU.
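For anyone running without Dora, the same pattern can be sketched by hand. This is an assumption-laden illustration and not Dora's implementation: `local_rank_from_env` and `setup_device` are hypothetical names, and the `LOCAL_RANK` variable is assumed to be set by the launcher (torchrun does this).

```python
import os
import typing as tp


def local_rank_from_env(env: tp.Mapping[str, str]) -> int:
    """Hypothetical helper: read the per-node rank, defaulting to 0."""
    return int(env.get("LOCAL_RANK", "0"))


def setup_device() -> str:
    """Select this process's GPU once, at the start of training."""
    import torch  # imported lazily so the helper above works without torch

    if not torch.cuda.is_available():
        return "cpu"
    # After this call, the plain 'cuda' device refers to this rank's GPU.
    torch.cuda.set_device(local_rank_from_env(os.environ))
    return "cuda"
```

Calling `setup_device()` once before any tensors are created means later code can pass `device='cuda'` and still land on the GPU that belongs to the current rank.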