sp-uhh / storm

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
MIT License

Cuda issues #15

Closed lianabagh closed 5 months ago

lianabagh commented 7 months ago

Hi,

I am facing CUDA driver issues. Could you specify your nvcc CUDA version, torch CUDA version, and gcc version?

lianabagh commented 7 months ago

And also GPU.

lianabagh commented 7 months ago

And also GPU driver cuda version ))

jmlemercier commented 7 months ago

Hi @lianabagh, the code for the paper ran on torch==1.10+cu11.6 with CUDA version 11.6. It should definitely also work with more recent CUDA/torch configurations, as long as they are aligned. I think the stable build is currently torch==2.2.0 with CUDA version 12.1.

The GPU model should not matter, as long as your CUDA driver supports it.
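To check whether a given setup is aligned in this sense, a small script can gather the versions asked about above. This is only a sketch: the helper names `run` and `collect_versions` are made up for illustration, and it assumes `nvcc`, `gcc` and `nvidia-smi` are on the PATH when installed. Anything that cannot be found is reported as missing rather than raising.

```python
import shutil
import subprocess


def run(cmd):
    """Return the stripped stdout of cmd, or None if the tool is unavailable."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


def collect_versions():
    """Collect compiler, driver and torch versions into a dict (hypothetical helper)."""
    report = {}

    nvcc = run(["nvcc", "--version"])
    if nvcc:
        # The line containing "release" carries the CUDA toolkit version.
        lines = [line.strip() for line in nvcc.splitlines()]
        report["nvcc"] = next((line for line in lines if "release" in line), lines[0])
    else:
        report["nvcc"] = None

    gcc = run(["gcc", "--version"])
    report["gcc"] = gcc.splitlines()[0] if gcc else None

    # GPU model and driver version as reported by the NVIDIA driver.
    report["gpu_and_driver"] = run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]
    )

    try:
        import torch
    except (ImportError, OSError):
        report["torch"] = None
        report["torch_cuda"] = None
    else:
        report["torch"] = torch.__version__
        # CUDA version torch was built against; None for CPU-only builds.
        report["torch_cuda"] = torch.version.cuda
    return report


if __name__ == "__main__":
    for name, value in collect_versions().items():
        print(f"{name}: {value if value is not None else 'not found'}")
```

If the CUDA version in the `nvcc` line and the `torch_cuda` value disagree, that mismatch is a likely place to start looking.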

jmlemercier commented 7 months ago

Also, ninja may cause problems with torch. If the error is caused by ninja, I'd recommend uninstalling both ninja and torch along with their dependencies, and reinstalling them. Also, don't forget to clean your torch-cuda caches in ~/.cache/torch_extensions after changing your torch / CUDA / ninja versions.
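A minimal sketch of the cache cleanup, assuming the default Linux location ~/.cache/torch_extensions unless the `TORCH_EXTENSIONS_DIR` environment variable points elsewhere. The function names are hypothetical, and the default is a dry run so that nothing is deleted by accident.

```python
import os
import shutil
from pathlib import Path


def torch_extensions_dir():
    """Return the directory assumed to hold torch's compiled-extension cache."""
    override = os.environ.get("TORCH_EXTENSIONS_DIR")
    if override:
        return Path(override)
    # Assumption: default Linux cache location mentioned in the comment above.
    return Path.home() / ".cache" / "torch_extensions"


def clear_torch_extensions(cache_dir=None, dry_run=True):
    """List the cached builds, and delete them when dry_run is False."""
    cache_dir = Path(cache_dir) if cache_dir is not None else torch_extensions_dir()
    if not cache_dir.is_dir():
        return []
    entries = sorted(cache_dir.iterdir())
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return entries


if __name__ == "__main__":
    # Dry run: only print what would be removed.
    for path in clear_torch_extensions():
        print("would remove:", path)
```

Calling `clear_torch_extensions(dry_run=False)` performs the actual deletion; torch rebuilds any extension it needs on the next run.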

lianabagh commented 7 months ago

Thanks a lot for the answers.

Yes, the problem is with ninja. I will try to reinstall everything.


lianabagh commented 6 months ago

The problem is with ninja, and it persists after applying the changes you suggested. Which ninja version are you using?


jmlemercier commented 6 months ago

I noticed that we actually removed the calls to ninja in the current implementation (since it was always causing problems), so I am not sure what your problem is.

jmlemercier commented 6 months ago

If you (1) clean your torch caches as mentioned, (2) restart your machine to reinitialize your CUDA kernels, (3) create a new environment and install the packages using the suggested requirements.txt, and optionally (4) restart your machine once more: does this still happen? If yes, please share a minimal working example (MWE).
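As a possible starting point for such an MWE, the sketch below checks whether torch imports, whether it can see ninja and a CUDA device, and whether a tiny GPU matrix product runs. The function name `cuda_smoke_test` is made up for illustration, and the script does not touch any StoRM code, so a clean result only rules out the basic torch / CUDA setup.

```python
def cuda_smoke_test():
    """Return a small report on the torch / ninja / CUDA setup (hypothetical helper)."""
    report = {
        "torch": None,
        "ninja_available": None,
        "cuda_available": None,
        "cuda_matmul_ok": None,
    }
    try:
        import torch
        from torch.utils.cpp_extension import is_ninja_available
    except (ImportError, OSError) as exc:
        # torch is missing or its shared libraries failed to load.
        report["error"] = repr(exc)
        return report

    report["torch"] = torch.__version__
    report["ninja_available"] = is_ninja_available()
    report["cuda_available"] = torch.cuda.is_available()

    if report["cuda_available"]:
        try:
            # Tiny matrix product on the GPU, forced to complete via synchronize().
            x = torch.randn(64, 64, device="cuda")
            y = x @ x
            torch.cuda.synchronize()
            report["cuda_matmul_ok"] = bool(torch.isfinite(y).all().item())
        except RuntimeError as exc:
            report["cuda_matmul_ok"] = False
            report["error"] = repr(exc)
    return report


if __name__ == "__main__":
    for name, value in cuda_smoke_test().items():
        print(f"{name}: {value}")
```

Posting this output together with the full error traceback should make the problem easier to reproduce.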