pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

cuda runtime error (3) when defining an instance of torchaudio.transforms.MelSpectrogram #518

Closed. ahmed-fau closed this issue 4 years ago

ahmed-fau commented 4 years ago

Hi, I am trying to use the MelSpectrogram module of torchaudio 0.4.0 with PyTorch 1.4.0 to calculate mel spectrograms for audio signals during training. When I run the code on a Tesla P100 SXM2 16GB machine, I get the error shown in the following image.

[image: screenshot of the error message]

This error occurs at the following line of code in my script:

mel1 = torchaudio.transforms.MelSpectrogram(n_mels=128, n_fft=4096, win_length=4096, hop_length=4096//4).to(cuda)

BTW: I tried the solution suggested by soumith in this link but it didn't work.

Any help understanding or fixing this problem would be appreciated. Many thanks in advance.

vincentqb commented 4 years ago

First, can you provide us with a minimal yet complete example that reproduces this error?

The comment talks of multiprocessing. Are you doing so?

Have you also tried running your code with the latest master version compiled from source?

ahmed-fau commented 4 years ago

> First, can you provide us with a minimal yet complete example that reproduces this error?

Actually, the code works properly on my local machine, which has an NVIDIA RTX 2080 GPU. When I run the code on a cluster of Tesla machines, however, I get this CUDA initialization error, and only for the line of code I have already provided.

> The comment talks of multiprocessing. Are you doing so?

Yes, I am already calling torch.multiprocessing.set_start_method('spawn'). I also tried setting force=True, but this didn't solve the issue.
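A minimal sketch of the start-method setup described here, for readers who want to compare it against their own script. It uses the standard-library multiprocessing module, which torch.multiprocessing wraps and which exposes the same set_start_method call. The helper name use_spawn is hypothetical and is not taken from the original script.

```python
import multiprocessing as mp


def use_spawn(force=False):
    """Select the 'spawn' start method and return the method now in effect.

    Without force=True, set_start_method raises RuntimeError if a start
    method was already fixed earlier in the process.
    """
    mp.set_start_method("spawn", force=force)
    return mp.get_start_method()


if __name__ == "__main__":
    # Call once, at program entry, before creating workers or touching CUDA.
    print(use_spawn(force=True))
```

The call belongs under the main guard because spawned workers re-import the main module. Without the guard, each worker would run the setup code again.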

> Have you also tried running your code with the latest master version compiled from source?

No, this occurs with torchaudio 0.4.0

ahmed-fau commented 4 years ago

I tried running with the MelSpectrogram line disabled and the error is still there. Sorry for the confusion.
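Since the error persists without the MelSpectrogram line, one way to narrow it down is to check CUDA initialization on its own, separately from torchaudio. The following is only a diagnostic sketch under stated assumptions: the function name cuda_report and the report keys are hypothetical, and the torch import is guarded so the script still runs where PyTorch is not installed.

```python
import os

try:
    import torch  # optional: the sketch degrades gracefully without it
except ImportError:
    torch = None


def cuda_report():
    """Collect basic facts about CUDA visibility in this process."""
    report = {
        "torch_installed": torch is not None,
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    if torch is None:
        return report
    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    if report["cuda_available"]:
        try:
            # Allocating a tensor forces CUDA initialization in this process.
            torch.zeros(1, device="cuda")
            report["init_ok"] = True
        except RuntimeError as exc:
            report["init_ok"] = False
            report["init_error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running this once on the local machine and once on a cluster node, in the same environment used for training, would show whether the two differ in visible devices or in the CUDA version PyTorch was built against.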