Open xuboming8 opened 3 years ago
You may have to use an older version of gcc. Alternatively, it may be simpler to use the official CUDA devel Docker images.
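To see whether the installed gcc is too new for a given CUDA toolkit, a small check like the one below may help. It is only a sketch: the `MAX_GCC_MAJOR` table and the helper names are my own, and only the CUDA 10.2 limit (gcc 8) is confirmed by the error message in this thread. The other entries are assumptions that should be checked against NVIDIA's installation guide.

```python
import re
import shutil
import subprocess

# Assumed newest gcc major version accepted by each CUDA release.
# Only the 10.2 entry is confirmed by the error message in this thread;
# verify the others against NVIDIA's installation guide.
MAX_GCC_MAJOR = {"10.0": 7, "10.1": 8, "10.2": 8}


def gcc_major(version_text):
    """Extract the major version from `gcc -dumpversion` output such as '9.3.0'."""
    match = re.match(r"\s*(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognised gcc version: {version_text!r}")
    return int(match.group(1))


def host_compiler_ok(cuda_version, gcc_version_text):
    """Return True if this gcc is no newer than the limit assumed for the CUDA release."""
    return gcc_major(gcc_version_text) <= MAX_GCC_MAJOR[cuda_version]


if __name__ == "__main__":
    gcc = shutil.which("gcc")
    if gcc is None:
        print("gcc not found on PATH")
    else:
        result = subprocess.run([gcc, "-dumpversion"], capture_output=True, text=True)
        version = result.stdout.strip()
        print("gcc", version, "usable with CUDA 10.2:", host_compiler_ok("10.2", version))
```

If this reports `False`, nvcc will most likely stop with the same "unsupported GNU version" error when the extension is built.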
I have the same problem. I've tried to run convert_weight.py several times with pytorch 1.3, pytorch 1.4, and pytorch 1.7, and also with gcc 5.2, gcc 4.8, and gcc 7.3, but every time I got the same error as in this issue. I used cuda 10.1 in all my experiments.
@denabazazian Could you try official docker images like nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04 + build-essential & pytorch & ninja? It will work without additional configurations.
@rosinality I tried to use the same Dockerfile, but for some reason I'm facing issues running tensorflow-gpu. Would it be possible for you to provide the Dockerfile you used? Thanks.
@Nerdyvedi For tensorflow problems, it would be easier to use cuda 10 images.
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
ARG APT_INSTALL="apt-get install -y --no-install-recommends"
ARG PIP_INSTALL="python -m pip --no-cache-dir install --upgrade"
ARG GIT_CLONE="git clone --depth 10"
ENV HOME /root
WORKDIR $HOME
RUN rm -rf /var/lib/apt/lists/* \
/etc/apt/sources.list.d/cuda.list \
/etc/apt/sources.list.d/nvidia-ml.list
RUN apt-get update
ARG DEBIAN_FRONTEND=noninteractive
RUN $APT_INSTALL build-essential software-properties-common ca-certificates \
wget git zlib1g-dev nasm cmake
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update
RUN $APT_INSTALL python3.7 python3.7-dev
RUN wget -O $HOME/get-pip.py https://bootstrap.pypa.io/get-pip.py
RUN python3.7 $HOME/get-pip.py
RUN ln -s /usr/bin/python3.7 /usr/local/bin/python3
RUN ln -s /usr/bin/python3.7 /usr/local/bin/python
RUN $PIP_INSTALL setuptools
RUN $PIP_INSTALL numpy scipy nltk lmdb cython pydantic pyhocon
RUN $PIP_INSTALL torch==1.7.1+cu92 torchvision==0.8.2+cu92 -f https://download.pytorch.org/whl/torch_stable.html
ENV FORCE_CUDA="1"
ENV TORCH_CUDA_ARCH_LIST="Pascal;Volta;Turing"
RUN $APT_INSTALL libsm6 libxext6 libxrender1
RUN $PIP_INSTALL opencv-python-headless
RUN python -m pip uninstall -y pillow pil jpeg libtiff libjpeg-turbo
RUN $GIT_CLONE https://github.com/libjpeg-turbo/libjpeg-turbo.git
WORKDIR libjpeg-turbo
RUN mkdir build
WORKDIR build
RUN cmake -G"Unix Makefiles" -DCMAKE_INSTALL_PREFIX=libjpeg-turbo -DWITH_JPEG8=1 ..
RUN make
RUN make install
WORKDIR libjpeg-turbo
RUN mv include/jerror.h include/jmorecfg.h include/jpeglib.h include/turbojpeg.h /usr/include
RUN mv include/jconfig.h /usr/include/x86_64-linux-gnu
RUN mv lib/*.* /usr/lib/x86_64-linux-gnu
RUN mv lib/pkgconfig/* /usr/lib/x86_64-linux-gnu/pkgconfig
RUN ldconfig
RUN CFLAGS="${CFLAGS} -mavx2" $PIP_INSTALL --force-reinstall --no-binary :all: --compile pillow-simd
WORKDIR $HOME
RUN ldconfig
RUN apt-get clean
RUN apt-get autoremove -y
RUN rm -rf /var/lib/apt/lists/* /tmp/* ~/*
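For reference, the `TORCH_CUDA_ARCH_LIST` value set in the Dockerfile above decides which `-gencode` flags are passed to nvcc when the extensions are compiled. The sketch below is a simplified illustration of that mapping, not PyTorch's actual implementation: the `NAMED_ARCHES` table and the `gencode_flags` helper are my own assumptions, and `+PTX` suffixes are ignored.

```python
# Assumed expansion of architecture names to compute capabilities.
# Verify against the PyTorch version you build with.
NAMED_ARCHES = {
    "Pascal": ["6.0", "6.1"],
    "Volta": ["7.0"],
    "Turing": ["7.5"],
}


def gencode_flags(arch_list):
    """Turn a TORCH_CUDA_ARCH_LIST-style string into nvcc -gencode flags."""
    flags = []
    for entry in arch_list.split(";"):
        # A name expands to its capabilities; a plain "7.5" is used as-is.
        for capability in NAMED_ARCHES.get(entry, [entry]):
            num = capability.replace(".", "")
            flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
    return flags


for flag in gencode_flags("Pascal;Volta;Turing"):
    print(flag)
```

The Turing entry yields `-gencode=arch=compute_75,code=sm_75`, which is the same flag that shows up in the nvcc command line of the build log in this issue.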
Traceback (most recent call last):
File "/home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1030, in _build_extension_module
check=True)
File "/home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "convert_weight.py", line 11, in <module>
from model import Generator, Discriminator
File "/home/10301007/stylegan2-pytorch-master/model.py", line 11, in <module>
from op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d, conv2d_gradfix
File "/home/10301007/stylegan2-pytorch-master/op/__init__.py", line 1, in <module>
from .fused_act import FusedLeakyReLU, fused_leaky_relu
File "/home/10301007/stylegan2-pytorch-master/op/fused_act.py", line 15, in <module>
os.path.join(module_path, "fused_bias_act_kernel.cu"),
File "/home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 661, in load
is_python_module)
File "/home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 830, in _jit_compile
with_cuda=with_cuda)
File "/home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 883, in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
File "/home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1043, in _build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'fused':
[1/3] /cm/shared/apps/cuda10.2/toolkit/10.2.89/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/TH -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/THC -isystem /cm/shared/apps/cuda10.2/toolkit/10.2.89/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /home/10301007/stylegan2-pytorch-master/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/cm/shared/apps/cuda10.2/toolkit/10.2.89/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/TH -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/THC -isystem /cm/shared/apps/cuda10.2/toolkit/10.2.89/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /home/10301007/stylegan2-pytorch-master/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
In file included from /cm/shared/apps/cuda10.2/toolkit/10.2.89/include/cuda_runtime.h:83,
from <command-line>:
/cm/shared/apps/cuda10.2/toolkit/10.2.89/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[2/3] c++ -MMD -MF fused_bias_act.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/TH -isystem /home/10301003/anaconda3/envs/pytorch1.2/lib/python3.7/site-packages/torch/include/THC -isystem /cm/shared/apps/cuda10.2/toolkit/10.2.89/include -isystem /home/10301003/anaconda3/envs/pytorch1.2/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++11 -c /home/10301007/stylegan2-pytorch-master/op/fused_bias_act.cpp -o fused_bias_act.o
ninja: build stopped: subcommand failed.
I use cuda 10.2, pytorch 1.3.1, and ninja 1.10.0. How can I solve this issue?
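The key line in the log is `gcc versions later than 8 are not supported`: nvcc from CUDA 10.2 refuses to use the default host compiler. One possible workaround, if an older g++ is installed next to the default one, is to point nvcc at it with its `-ccbin` option. The sketch below is untested against this repository. The helper names are hypothetical, and it assumes older compilers are installed under versioned names such as `g++-8`, which is a Debian/Ubuntu convention and may not hold on a cluster.

```python
import shutil


def find_host_compiler(max_major=8, min_major=5, which=shutil.which):
    """Return the path of the newest g++-N with min_major <= N <= max_major, or None.

    Assumes versioned binary names such as g++-8 (Debian/Ubuntu convention).
    """
    for major in range(max_major, min_major - 1, -1):
        path = which(f"g++-{major}")
        if path:
            return path
    return None


def ccbin_flags(compiler_path):
    """Build the nvcc flags that select a specific host compiler."""
    return ["-ccbin", compiler_path] if compiler_path else []


if __name__ == "__main__":
    print(ccbin_flags(find_host_compiler()))
```

The resulting list could then be passed as `extra_cuda_cflags=...` to the `load(...)` call in `op/fused_act.py`, and likewise wherever the other ops are loaded. The `.cpp` file in step `[2/3]` is still compiled with the default `c++`, so it may also be necessary to set the `CXX` environment variable to the same older compiler before running `convert_weight.py`. I have not verified that on pytorch 1.3.1.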