baowenbo / DAIN

Depth-Aware Video Frame Interpolation (CVPR 2019)
https://sites.google.com/view/wenbobao/dain
MIT License

Error building cuda packages #125

Open cravisjan97 opened 3 years ago

cravisjan97 commented 3 years ago

I tried to build the CUDA packages during installation and got a runtime error. Here are the commands I ran:

cd DAIN/my_package/MinDepthFlowProjection/
rm -rf build *.egg-info dist
python setup.py install

After this, I get a runtime error:

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/csundaram/anaconda3/envs/dain/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1533, in _run_ninja_build
    subprocess.run(
  File "/home/csundaram/anaconda3/envs/dain/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

File "/home/csundaram/anaconda3/envs/dain/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1555, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

How do I solve this?

Note: I see that the compilation uses -D_GLIBCXX_USE_CXX11_ABI=0, but I know that my machine requires -D_GLIBCXX_USE_CXX11_ABI=1. Setting it might solve the issue, but I don't know how to set this flag in setup.py.
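For reference, here is a minimal sketch of how the ABI setting could be inspected and turned into a compiler flag. It assumes PyTorch is importable and relies on `torch.compiled_with_cxx11_abi()`; the helper names `abi_flag` and `detect_torch_abi` are hypothetical and are not part of DAIN. Extensions generally have to match the ABI that PyTorch itself was built with, so forcing a different value may only move the failure to link or import time.

```python
def abi_flag(use_cxx11_abi: bool) -> str:
    """Return the preprocessor define that selects the libstdc++ ABI."""
    return "-D_GLIBCXX_USE_CXX11_ABI={}".format(int(use_cxx11_abi))


def detect_torch_abi():
    """Return the ABI PyTorch was built with, or None if torch is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return bool(torch.compiled_with_cxx11_abi())


if __name__ == "__main__":
    detected = detect_torch_abi()
    if detected is None:
        print("PyTorch not importable; cannot detect ABI")
    else:
        # This string could be appended to cxx_args / nvcc_args in setup.py
        # (an assumption about how one might wire it in, not the repo's code).
        print(abi_flag(detected))
```

If the printed flag already matches what appears in the ninja output, the ABI setting is probably not the cause of the failure.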

Any help is appreciated. Thanks!

theaswanson commented 3 years ago

I'm getting the same issue. I'm running CUDA 11 on WSL 2, if it helps. I'm also getting lots of compilation errors and warnings, likely because the generated build.ninja files use -std=c++11 rather than c++14 or later.

Edit: I just saw issue #118 and realized I was using PyTorch 1.7.1, while the latest version DAIN supports is 1.4.0. I will compile PyTorch 1.4.0 from source for CUDA 11 and see if that fixes anything. The README might need updating to reflect this version requirement.
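A quick way to catch this kind of mismatch before building is to compare the installed version string against the supported one. The sketch below is only an illustration: `MAX_SUPPORTED` is taken from the comment above rather than from any official DAIN requirement file, and `parse_version` and `is_supported` are hypothetical helper names.

```python
import re

# Taken from the comment above; treat this as an assumption, not an official pin.
MAX_SUPPORTED = (1, 4, 0)


def parse_version(version: str):
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1), ignoring build suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    # A missing patch component (e.g. '1.9') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_supported(version: str, max_supported=MAX_SUPPORTED) -> bool:
    """Return True if the version is not newer than max_supported."""
    return parse_version(version) <= max_supported
```

With PyTorch installed, one would call `is_supported(torch.__version__)` before running the setup.py scripts.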

flobauer commented 3 years ago

Have you been able to compile 1.4.0 with CUDA 11? I have not been able to do it yet.

theflanman commented 3 years ago

I appear to have been able to get it working. Here's what I had to do, based on a few issues I read up on.

YouweiLyu commented 3 years ago

Manually setting the cxx and nvcc arguments in ./my_package/*/setup.py may help solve this compilation problem. Refer to the NVIDIA website to find the proper -gencode value for your GPU.

cxx_args = ['-std=c++14']
nvcc_args = [
    # '-gencode', 'arch=compute_50,code=sm_50',
    # '-gencode', 'arch=compute_52,code=sm_52',
    # '-gencode', 'arch=compute_60,code=sm_60',
    # '-gencode', 'arch=compute_61,code=sm_61'
    '-gencode', 'arch=compute_86,code=sm_86', # for RTX3090
    # '-gencode', 'arch=compute_70,code=compute_70'
]
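Rather than editing the list by hand in every setup.py, the -gencode entries could be generated from compute capabilities. This is a sketch under assumptions: `gencode_args` is a hypothetical helper, not part of DAIN, and capabilities are given as strings such as '8.6'. With PyTorch installed, `torch.cuda.get_device_capability()` returns a (major, minor) tuple that could be used to build such a string.

```python
def gencode_args(capabilities, include_ptx=False):
    """Build nvcc -gencode arguments for compute capabilities such as '8.6'."""
    args = []
    for cap in capabilities:
        major, minor = cap.split(".")
        code = major + minor  # '8.6' -> '86'
        args += ["-gencode", "arch=compute_{0},code=sm_{0}".format(code)]
    if include_ptx and capabilities:
        # Also embed PTX for the newest listed architecture so that
        # newer GPUs can JIT-compile the kernels at load time.
        newest = max(capabilities,
                     key=lambda c: tuple(int(p) for p in c.split(".")))
        code = newest.replace(".", "")
        args += ["-gencode", "arch=compute_{0},code=compute_{0}".format(code)]
    return args


if __name__ == "__main__":
    print(gencode_args(["8.6"]))
```

For an RTX 3090, `gencode_args(["8.6"])` yields the same single entry as the list above. Note that the installed CUDA toolkit must itself know the target architecture, otherwise nvcc rejects the flag.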

I am using

Also, remember to change pytorch1.0.0 to your env name in ./build.sh. This approach works for me. After the modification, some code has to be adapted for PyTorch 1.7.1 as well. I hope this method is helpful to you. @flobauer @cravisjan97 @theflanman

Khipucamayoc commented 2 years ago

@YouweiLyu, would it be too much to ask you to share a walk-through on how to build/install it with CUDA 11.1? I am getting so many errors it would be useless to share them here (maybe)...

Could you tell me which parts of the code need to be adapted for PyTorch 1.7.1?

I am using:

Ubuntu 20.04
PyTorch 1.9 & Anaconda env python=3.8
CUDA 11.1
RTX 3090

I changed pytorch1.0.0 to my env name in ./build.sh as you recommended, but it still does not work... I suppose I just don't know which parts of the code need to be adapted for PyTorch 1.9? As it is backwards compatible, the problem should not lie in the release, I suppose...

All help is appreciated!

michaelmaverick commented 2 years ago

This should work. compiler_args.txt

Khipucamayoc commented 2 years ago

Hey, thanks for that @michaelmaverick! But that is not the problem, as I had already made the changes listed in your .txt file... and still nothing. @YouweiLyu mentioned that "After the modification, some codes have to be adapted for pytorch 1.7.1"... is that the only other thing one is supposed to change, then? @michaelmaverick, do you have it running? If you have the time, a walkthrough would be extremely appreciated, as I think the issue must lie somewhere else. Thanks in advance!

laomao0 commented 2 years ago

If you do not want to build the CUDA programs, we provide a CuPy version of those packages. The CuPy files do not need to be built. Please refer to: https://github.com/laomao0/cupy_packages