sp-uhh / sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
MIT License

OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root. #6

Closed snufas closed 7 months ago

snufas commented 2 years ago

I'm not sure what the problem is:

(sgmse) root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse# python enhancement.py --test_dir input --enhanced_dir output --ckpt models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt
Traceback (most recent call last):
  File "enhancement.py", line 10, in <module>
    from sgmse.model import ScoreModel
  File "/home/snufas/github_projects/sgmse/sgmse/model.py", line 11, in <module>
    from sgmse.backbones import BackboneRegistry
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/__init__.py", line 2, in <module>
    from .ncsnpp import NCSNpp
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp.py", line 18, in <module>
    from .ncsnpp_utils import layers, layerspp, normalization
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/layerspp.py", line 20, in <module>
    from . import up_or_down_sampling
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/up_or_down_sampling.py", line 10, in <module>
    from .op import upfirdn2d
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/op/__init__.py", line 1, in <module>
    from .fused_act import FusedLeakyReLU, fused_leaky_relu
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/op/fused_act.py", line 11, in <module>
    fused = load(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1514, in _write_ninja_file_and_build_library
    extra_ldflags = _prepare_ldflags(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1622, in _prepare_ldflags
    extra_ldflags.append(f'-L{_join_cuda_home("lib64")}')
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 2125, in _join_cuda_home
    raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
(sgmse) root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse#

Thanks for the help.

snufas commented 2 years ago

I managed to get rid of the error above, however another one popped up:

(sgmse) root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse# python enhancement.py --test_dir input --enhanced_dir output/ --ckpt models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt
Traceback (most recent call last):
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
    subprocess.run(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "enhancement.py", line 10, in <module>
    from sgmse.model import ScoreModel
  File "/home/snufas/github_projects/sgmse/sgmse/model.py", line 11, in <module>
    from sgmse.backbones import BackboneRegistry
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/__init__.py", line 2, in <module>
    from .ncsnpp import NCSNpp
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp.py", line 18, in <module>
    from .ncsnpp_utils import layers, layerspp, normalization
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/layerspp.py", line 20, in <module>
    from . import up_or_down_sampling
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/up_or_down_sampling.py", line 10, in <module>
    from .op import upfirdn2d
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/op/__init__.py", line 1, in <module>
    from .fused_act import FusedLeakyReLU, fused_leaky_relu
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/op/fused_act.py", line 11, in <module>
    fused = load(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused':
[1/2] /usr/local/cuda-11.7/bin/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7/bin/include -isystem /root/miniconda3/envs/sgmse/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_52,code=sm_52 --compiler-options '-fPIC' -std=c++14 -c /home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/usr/local/cuda-11.7/bin/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-11.7/bin/include -isystem /root/miniconda3/envs/sgmse/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_52,code=sm_52 --compiler-options '-fPIC' -std=c++14 -c /home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
/bin/sh: 1: /usr/local/cuda-11.7/bin/bin/nvcc: not found
ninja: build stopped: subcommand failed.

(sgmse) root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse#

cobalamin commented 2 years ago

/bin/sh: 1: /usr/local/cuda-11.7/bin/bin/nvcc: not found

Some path seems to be set up wrong here (there's a duplicate bin: .../bin/bin/...). Could you maybe look back into how you fixed the missing CUDA_HOME problem, and check if there's an extra bin that shouldn't be there?
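A quick way to check for this kind of mistake is sketched below. It assumes only that a CUDA toolkit root contains `bin/nvcc`, which is what the failing build command above is looking for; the function name `check_cuda_home` is made up for this illustration and is not part of sgmse or PyTorch.

```python
import os
from pathlib import Path


def check_cuda_home(cuda_home):
    """Return a list of human-readable problems with a CUDA_HOME value.

    An empty list means the value looks plausible (hypothetical helper).
    """
    if not cuda_home:
        return ["CUDA_HOME is not set"]
    problems = []
    root = Path(cuda_home)
    # CUDA_HOME should be the toolkit root, not its bin/ directory;
    # otherwise the build ends up calling <root>/bin/bin/nvcc.
    if root.name == "bin":
        problems.append(
            f"{root} ends in 'bin'; the toolkit root is probably {root.parent}"
        )
    # The extension build invokes <CUDA_HOME>/bin/nvcc.
    if not (root / "bin" / "nvcc").is_file():
        problems.append(f"no nvcc found at {root / 'bin' / 'nvcc'}")
    return problems


if __name__ == "__main__":
    found = check_cuda_home(os.environ.get("CUDA_HOME"))
    for line in found or ["CUDA_HOME looks plausible"]:
        print(line)
```

With `CUDA_HOME=/usr/local/cuda-11.7/bin`, as the log suggests, this would report both the trailing `bin` and the missing `bin/bin/nvcc`.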

snufas commented 2 years ago

@cobalamin Now I am getting no errors, but no output either... [image]

cobalamin commented 2 years ago

It seems the script didn't find any files. Does your test_dir match the required structure? In particular, are the .wav files stored in a subdirectory named noisy/? (See python enhancement.py --help.)

snufas commented 2 years ago

@cobalamin Some more debug info for you: [image]

"In particular, are the .wav files stored in a subdirectory named noisy/?"

I created two directories:

1. input, where the .wav file is
2. output, where I want the enhanced file to go

Hence:

python enhancement.py --test_dir input --enhanced_dir output --ckpt models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt

Is that wrong? Thanks for the help.

julius-richter commented 2 years ago

Please create another folder named "noisy" inside your folder named "input" and put the files to be processed there. Then the call python enhancement.py --test_dir input --enhanced_dir output --ckpt models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt should work.

See: https://github.com/sp-uhh/sgmse/blob/cfe2704ee22339e400fb83512924e7fb69eaec53/enhancement.py#L24
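The folder layout described above can also be set up with a few lines of Python. This is only a sketch: `prepare_test_dir` is a hypothetical helper, and it assumes nothing about the script beyond what is said in this thread, namely that the .wav files must sit in a `noisy/` subdirectory of the folder passed as `--test_dir`.

```python
import shutil
from pathlib import Path


def prepare_test_dir(test_dir):
    """Move .wav files lying directly in test_dir into test_dir/noisy/.

    Returns the sorted list of .wav paths inside noisy/ afterwards.
    """
    noisy = Path(test_dir) / "noisy"
    noisy.mkdir(parents=True, exist_ok=True)
    # Only top-level .wav files are moved; files already in noisy/ stay put.
    for wav in sorted(Path(test_dir).glob("*.wav")):
        shutil.move(str(wav), str(noisy / wav.name))
    return sorted(noisy.glob("*.wav"))


if __name__ == "__main__":
    for path in prepare_test_dir("input"):
        print(path)
```

If the returned list is empty, the enhancement script would presumably have nothing to process either, which matches the "no errors, but no output" symptom.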

snufas commented 1 year ago

@julius-richter I have no idea what to do with this:

root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse# conda activate sgmse
(sgmse) root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse# python enhancement.py --test_dir input --enhanced_dir output --ckpt models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt
  0%| | 0/1 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "enhancement.py", line 63, in <module>
    sample, _ = sampler()
  File "/home/snufas/github_projects/sgmse/sgmse/sampling/__init__.py", line 60, in pc_sampler
    xt, xt_mean = corrector.update_fn(xt, vec_t, y)
  File "/home/snufas/github_projects/sgmse/sgmse/sampling/correctors.py", line 77, in update_fn
    grad = self.score_fn(x, t, *args)
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/snufas/github_projects/sgmse/sgmse/model.py", line 142, in forward
    score = -self.dnn(dnn_input, t)
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp.py", line 304, in forward
    h = modules[m_idx](hs[-1], temb)
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/snufas/github_projects/sgmse/sgmse/backbones/ncsnpp_utils/layerspp.py", line 243, in forward
    h = self.act(self.GroupNorm_0(x))
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/nn/modules/activation.py", line 391, in forward
    return F.silu(input, inplace=self.inplace)
  File "/root/miniconda3/envs/sgmse/lib/python3.8/site-packages/torch/nn/functional.py", line 2048, in silu
    return torch._C._nn.silu(input)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
(sgmse) root@DESKTOP-7A6UGRU:/home/snufas/github_projects/sgmse#

Any suggestions? Thanks for your input.

cobalamin commented 1 year ago

@snufas This does not look like a problem with our code since it errors deep within the implementation of PyTorch. I sometimes have errors like that occur on my machine when I wake it up from sleep mode. Have you tried restarting your machine, to force the CUDA kernel modules to reinitialize?

If you have restarted your machine and it doesn't help, can you try calling the following, as the error message suggests:

CUDA_LAUNCH_BLOCKING=1 python enhancement.py --test_dir input --enhanced_dir output --ckpt models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt

and see if you get some additional debugging output?
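If prefixing the command with an environment variable is awkward in a given shell, the same run can be launched from Python. The sketch below is an illustration only: `blocking_env` and `run_blocking` are hypothetical helper names, and all they do is start the script in a child process with `CUDA_LAUNCH_BLOCKING=1` set, which is the variable named in the error message.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Copy an environment mapping and enable synchronous CUDA kernel launches."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_blocking(args):
    """Run the current Python interpreter with args and the variable set.

    Returns the child's exit code; its output goes to the same terminal.
    """
    return subprocess.run([sys.executable, *args], env=blocking_env()).returncode
```

For example, `run_blocking(["enhancement.py", "--test_dir", "input", "--enhanced_dir", "output", "--ckpt", "models/Speech_Enhancement/train_wsj0_2cta4cov_epoch=159.ckpt"])` would be equivalent to the command above.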

snufas commented 1 year ago

@cobalamin Nothing: "CUDA_LAUNCH_BLOCKING=1" didn't give any extra information. I haven't had problems like this in any other project. I am not saying that it is the project's fault; it may be due to my specific configuration. Three things to note, though:

1. I am running on WSL2.
2. My GPU is quite old (NVIDIA GeForce GTX 980).
3. I am still not able to get the CUDA home path for some reason.

So yeah, I doubt that I will be able to run this project... But thank you very much for your input and suggestions, I really appreciate it. Feel free to close this issue yourself; I am giving up on it, although the samples sound very good indeed.

osimantir commented 1 year ago

"OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root."

Hi! I'm getting the same error. How did you solve it? Thank you.
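When CUDA_HOME is not set, one way to look for a candidate toolkit root is sketched below. The lookup order (the CUDA_HOME or CUDA_PATH variable first, then the location of `nvcc` on the PATH) is an assumption made for this illustration, and `guess_cuda_home` is a hypothetical helper, not PyTorch's own code.

```python
import os
import shutil
from pathlib import Path


def guess_cuda_home(environ=None, which=shutil.which):
    """Return a candidate CUDA toolkit root as a string, or None if none is found."""
    environ = os.environ if environ is None else environ
    # 1. An explicitly configured root wins.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        if environ.get(var):
            return environ[var]
    # 2. Otherwise derive it from nvcc on the PATH:
    #    nvcc lives in <root>/bin/nvcc, so the root is two levels up.
    nvcc = which("nvcc")
    if nvcc:
        return str(Path(nvcc).parent.parent)
    return None


if __name__ == "__main__":
    print(guess_cuda_home() or "No CUDA toolkit found; nvcc is not on the PATH")
```

If this prints a path, exporting it as CUDA_HOME (the root itself, without a trailing `bin`) before running enhancement.py may be enough; if it finds nothing, the CUDA toolkit with `nvcc` is probably not installed in this environment.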

tvanh512 commented 1 year ago

"OSError: CUDA_HOME environment variable is not set" could be solved with "conda install -c conda-forge cudatoolkit-dev"

raimmm commented 1 month ago

"I managed to get rid of the error above, however another one popped up [...] /bin/sh: 1: /usr/local/cuda-11.7/bin/bin/nvcc: not found"

Excuse me, how did you solve this problem?