NVlabs / stylegan3

Official PyTorch implementation of StyleGAN3

Setting up PyTorch plugin "bias_act_plugin"... Failed! #633

Open kst5137 opened 10 months ago

kst5137 commented 10 months ago

Dear authors, I get the following error when running the code:

python gen_images.py --outdir=out --trunc=1 --seeds=2 \
    --network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl
Generating image for seed 2 (0/1) ...
Setting up PyTorch plugin "bias_act_plugin"... Failed!
Traceback (most recent call last):
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
    subprocess.run(
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/tvconda/DiffAttack/yolov8/stylegan3/gen_images.py", line 143, in <module>
    generate_images() # pylint: disable=no-value-for-parameter
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/tvconda/DiffAttack/yolov8/stylegan3/gen_images.py", line 135, in generate_images
    img = G(z, label, truncation_psi=truncation_psi, noise_mode=noise_mode)
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 503, in forward
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 143, in forward
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 92, in forward
  File "/home/tvconda/DiffAttack/yolov8/stylegan3/torch_utils/ops/bias_act.py", line 84, in bias_act
    if impl == 'cuda' and x.device.type == 'cuda' and _init():
  File "/home/tvconda/DiffAttack/yolov8/stylegan3/torch_utils/ops/bias_act.py", line 41, in _init
    _plugin = custom_ops.get_plugin(
  File "/home/tvconda/DiffAttack/yolov8/stylegan3/torch_utils/custom_ops.py", line 136, in get_plugin
    torch.utils.cpp_extension.load(name=module_name, build_directory=cached_build_dir,
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'bias_act_plugin': [1/3] /usr/bin/nvcc  -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/TH -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/THC -isystem /home/tvconda/anaconda3/envs/yolov8/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' --use_fast_math --allow-unsupported-compiler -std=c++14 -c /home/tvconda/.cache/torch_extensions/py310_cu113/bias_act_plugin/3cb576a0039689487cfba59279dd6d46-nvidia-geforce-rtx-3090/bias_act.cu -o bias_act.cuda.o 
FAILED: bias_act.cuda.o 
/usr/bin/nvcc  -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/TH -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/THC -isystem /home/tvconda/anaconda3/envs/yolov8/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' --use_fast_math --allow-unsupported-compiler -std=c++14 -c /home/tvconda/.cache/torch_extensions/py310_cu113/bias_act_plugin/3cb576a0039689487cfba59279dd6d46-nvidia-geforce-rtx-3090/bias_act.cu -o bias_act.cuda.o 
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^ 
/usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 |         operator=(_Functor&& __f)
      |                                                                                                                                                  ^ 
/usr/include/c++/11/bits/std_function.h:530:146: note:         ‘_ArgTypes’
[2/3] c++ -MMD -MF bias_act.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/TH -isystem /home/tvconda/anaconda3/envs/yolov8/lib/python3.10/site-packages/torch/include/THC -isystem /home/tvconda/anaconda3/envs/yolov8/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /home/tvconda/.cache/torch_extensions/py310_cu113/bias_act_plugin/3cb576a0039689487cfba59279dd6d46-nvidia-geforce-rtx-3090/bias_act.cpp -o bias_act.o 
ninja: build stopped: subcommand failed.

System Details:

Additional context: when I run nvcc --version in the conda env, it reports 11.5, because a teammate who shares this computer needs CUDA 11.5. I could not find a previous PyTorch version for CUDA 11.5, so I installed the CUDA 11.3 build with the following command:

# CUDA 11.3
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch

This setup worked without problems in another project, but now I get the error above. Is the error caused by the CUDA version mismatch?
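One way to check for such a mismatch is to compare the CUDA version PyTorch was built with (torch.version.cuda) against the release that nvcc --version prints. The sketch below is only an illustration: the helper names are made up for this example, and report_mismatch() assumes that both torch and nvcc are available in the active environment. A difference between the two versions does not by itself prove that it causes the build failure.

```python
import re
import subprocess


def parse_nvcc_release(version_output):
    """Return (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_torch_cuda(torch_cuda):
    """Return (major, minor) from a string such as torch.version.cuda ('11.3')."""
    if not torch_cuda:  # torch.version.cuda is None on CPU-only builds
        return None
    major, minor = torch_cuda.split(".")[:2]
    return int(major), int(minor)


def report_mismatch():
    """Print both versions for the active environment (needs torch and nvcc)."""
    import torch  # imported here so the parsers above work without PyTorch

    out = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True
    ).stdout
    nvcc = parse_nvcc_release(out)
    built = parse_torch_cuda(torch.version.cuda)
    print(f"nvcc: {nvcc}, PyTorch built with: {built}, match: {nvcc == built}")
```

Calling report_mismatch() inside the conda env prints both versions side by side. In the setup described here it would be expected to show 11.5 for nvcc and 11.3 for PyTorch.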

I found a suggested fix for the '/usr/include/c++/11' errors: using an older build of GCC reportedly solves the issue. So I ran:

$ sudo apt install gcc-10 g++-10
$ export CC=/usr/bin/gcc-10
$ export CXX=/usr/bin/g++-10
$ export CUDA_ROOT=/usr/local/cuda
$ ln -s /usr/bin/gcc-10 $CUDA_ROOT/bin/gcc
$ ln -s /usr/bin/g++-10 $CUDA_ROOT/bin/g++

It still did not work, so I tried:

sudo apt install gcc-10 g++-10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 10
sudo update-alternatives --config gcc
sudo update-alternatives --config g++ 

That did not work either.
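To see whether a compiler switch actually took effect, it can help to look at what the failing build log itself says: which nvcc binary was invoked and which libstdc++ header directory the errors come from. The sketch below is a hypothetical helper (summarize_build_log is not part of StyleGAN3 or PyTorch) that pulls both out of the log text. In the log above it would report /usr/bin/nvcc and the GCC 11 headers under /usr/include/c++/11.

```python
import re


def summarize_build_log(log_text):
    """Extract the nvcc paths and libstdc++ header versions seen in a build log."""
    # Any whitespace-delimited token that ends in "/nvcc", e.g. "/usr/bin/nvcc".
    nvcc_paths = sorted(set(re.findall(r"(\S*/nvcc)\s", log_text)))
    # Major version of header directories such as "/usr/include/c++/11/".
    header_versions = sorted(
        {int(v) for v in re.findall(r"/usr/include/c\+\+/(\d+)/", log_text)}
    )
    return {"nvcc": nvcc_paths, "libstdcxx_headers": header_versions}


# Two lines shortened from the log in this issue.
SAMPLE_LOG = (
    "[1/3] /usr/bin/nvcc  -DTORCH_EXTENSION_NAME=bias_act_plugin -c bias_act.cu\n"
    "/usr/include/c++/11/bits/std_function.h:435:145: error: "
    "parameter packs not expanded with '...':\n"
)

if __name__ == "__main__":
    print(summarize_build_log(SAMPLE_LOG))
```

If the header version in a fresh log is still 11 after switching to gcc-10, the build is probably still using the GCC 11 toolchain. Since the log shows /usr/bin/nvcc rather than an nvcc under /usr/local/cuda, the symlinks created in $CUDA_ROOT/bin may not affect this build at all.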

I'm not experienced with machine learning. Could somebody please help me? I have already spent far too much time on this.

sans-dev commented 9 months ago

I had similar issues and found that they had something to do with the CUDA driver and toolkit versions. After trying several things without getting anywhere, I switched to Docker, which worked for me right away after installing everything.