alexandrosstergiou / SoftPool

[ICCV 2021] Code for approximated exponential maximum pooling
MIT License
288 stars, 52 forks

Hello, what is this "RuntimeError: output_grad.is_contiguous() INTERNAL ASSERT FAILED"? How can I fix the bug? #18

Closed Jackyinuo closed 3 years ago

Jackyinuo commented 3 years ago

When I use "softpool" in my own CNN, this error occurs:

RuntimeError: output_grad.is_contiguous() INTERNAL ASSERT FAILED at "E:\SoftPool-master\pytorch\CUDA\softpool_cuda.cpp":119, please report a bug to PyTorch. output_grad must be a contiguous tensor

alexandrosstergiou commented 3 years ago

Hi @Jackyinuo ,

This issue is a duplicate of #16. You can either add the code snippet from the comment in that issue or re-build the project with the latest commit.

Best, Alex

Jackyinuo commented 3 years ago

@alexandrosstergiou Hi alexandrosstergiou,

When I rebuild the project, I get this warning:

C:\Users\admin\.conda\envs\pt1\lib\site-packages\torch\utils\cpp_extension.py:274: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
  warnings.warn('Error checking compiler version for {}: {}'.format(compiler, error))

Does it matter?

alexandrosstergiou commented 3 years ago

I have not tried to install the project on Windows, but from the message, it sounds like your system was not able to find a compiler. Have you installed Visual Studio with "Build Tools for Visual Studio"? If not, do so and try to re-build the project.

Best, Alex

Jackyinuo commented 3 years ago

@alexandrosstergiou OK, thanks. When I build on Ubuntu with "make install", I get "/bin/sh: 1: :/usr/local/cuda/bin/nvcc: not found". I also edited ~/.bashrc (vim ~/.bashrc) to add the lines below, but the same error occurs.

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

alexandrosstergiou commented 3 years ago

I assume that you have reloaded your bashrc settings afterward (i.e. $ source ~/.bashrc)? If so, do you get the nvcc version with $ nvcc --version? In general, I would just suggest re-installing CUDA, the CUDA toolkit, etc.
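The check suggested above can also be scripted. This is a minimal sketch using only the Python standard library; `find_nvcc` and `empty_path_entries` are hypothetical helper names, and searching PATH is an assumption about how the compiler is located (PyTorch's extension builder may consult the CUDA_HOME environment variable instead).

```python
import os
import shutil


def find_nvcc(path=None):
    """Return the full path of an executable named nvcc, or None.

    `path` is a PATH-style string; when omitted, the PATH of the
    current process is searched.
    """
    return shutil.which("nvcc", path=path)


def empty_path_entries(path):
    """Count empty entries in a PATH-style string (e.g. a leading separator)."""
    return sum(1 for entry in path.split(os.pathsep) if entry == "")


if __name__ == "__main__":
    current = os.environ.get("PATH", "")
    print("nvcc:", find_nvcc() or "not found on PATH")
    print("empty PATH entries:", empty_path_entries(current))
```

Running it in the same shell (and the same conda environment) used for "make install" shows what that shell actually sees after ~/.bashrc has been reloaded.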

Jackyinuo commented 3 years ago

@alexandrosstergiou
I tried the steps, including "source ~/.bashrc", "nvcc --version", and "make install", but the problem still happens. The PyTorch and CUDA versions are 1.7.0 and 11.0, respectively.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0

$ make install
rm -rf *.egg-info
rm -rf build dist
python setup.py install
running install
running bdist_egg
running egg_info
creating SoftPool.egg-info
writing SoftPool.egg-info/PKG-INFO
writing dependency_links to SoftPool.egg-info/dependency_links.txt
writing top-level names to SoftPool.egg-info/top_level.txt
writing manifest file 'SoftPool.egg-info/SOURCES.txt'
reading manifest file 'SoftPool.egg-info/SOURCES.txt'
writing manifest file 'SoftPool.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/SoftPool
copying SoftPool/idea.py -> build/lib.linux-x86_64-3.8/SoftPool
copying SoftPool/__init__.py -> build/lib.linux-x86_64-3.8/SoftPool
running build_ext
building 'softpool_cuda' extension
creating /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8
creating /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/CUDA
Emitting ninja build file /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] :/usr/local/cuda:/usr/local/cuda/bin/nvcc -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/TH -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/THC -I:/usr/local/cuda:/usr/local/cuda/include -I/home/phzhou/anaconda3/envs/pt1/include/python3.8 -c -c /disk1/huihui/SoftPool-master/pytorch/CUDA/softpool_cuda_kernel.cu -o /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/CUDA/softpool_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=softpool_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_70,code=sm_70 -std=c++14
FAILED: /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/CUDA/softpool_cuda_kernel.o
:/usr/local/cuda:/usr/local/cuda/bin/nvcc -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/TH -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/THC -I:/usr/local/cuda:/usr/local/cuda/include -I/home/phzhou/anaconda3/envs/pt1/include/python3.8 -c -c /disk1/huihui/SoftPool-master/pytorch/CUDA/softpool_cuda_kernel.cu -o /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/CUDA/softpool_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=softpool_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_70,code=sm_70 -std=c++14
/bin/sh: 1: :/usr/local/cuda:/usr/local/cuda/bin/nvcc: not found
[2/2] c++ -MMD -MF /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/CUDA/softpool_cuda.o.d -pthread -B /home/phzhou/anaconda3/envs/pt1/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/TH -I/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/THC -I:/usr/local/cuda:/usr/local/cuda/include -I/home/phzhou/anaconda3/envs/pt1/include/python3.8 -c -c /disk1/huihui/SoftPool-master/pytorch/CUDA/softpool_cuda.cpp -o /disk1/huihui/SoftPool-master/pytorch/build/temp.linux-x86_64-3.8/CUDA/softpool_cuda.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=softpool_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/ATen/Parallel.h:149:0,
                 from /home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3,
                 from /home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5,
                 from /home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3,
                 from /home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/all.h:12,
                 from /home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/torch/extension.h:4,
                 from /disk1/huihui/SoftPool-master/pytorch/CUDA/softpool_cuda.cpp:1:
/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/include/ATen/ParallelOpenMP.h:84:0: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
 #pragma omp parallel for if ((end - begin) >= grain_size)
 ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1516, in _run_ninja_build
    subprocess.run(
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "setup.py", line 4, in <module>
    setup(
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/install.py", line 67, in run
    self.do_egg_install()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/install.py", line 109, in do_egg_install
    self.run_command('bdist_egg')
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 167, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 153, in call_command
    self.run_command(cmdname)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/command/install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 653, in build_extensions
    build_ext.build_extensions(self)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/distutils/command/build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 473, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1233, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/home/phzhou/anaconda3/envs/pt1/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1538, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
Makefile:2: recipe for target 'install' failed
make: *** [install] Error 1
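
One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: the command that /bin/sh cannot find is `:/usr/local/cuda:/usr/local/cuda/bin/nvcc`, and the same prefix appears in `-I:/usr/local/cuda:/usr/local/cuda/include`. PyTorch's extension builder forms such paths by joining the CUDA toolkit root (taken from the CUDA_HOME environment variable when it is set) with `bin/nvcc` and `include`, so a CUDA_HOME written like a colon-separated PATH list would produce exactly this string. The sketch below, with the hypothetical helpers `check_cuda_home` and `nvcc_command`, flags such values using only the standard library.

```python
import os


def check_cuda_home(value):
    """Return a list of problems found in a CUDA_HOME-style value.

    An empty list means the value looks like a single directory path.
    Only the string is inspected; the filesystem is not touched.
    """
    problems = []
    if not value:
        problems.append("CUDA_HOME is unset or empty")
        return problems
    if os.pathsep in value:
        problems.append(
            "contains %r: expected one directory, not a PATH-style list"
            % os.pathsep
        )
    if value != value.strip():
        problems.append("has leading or trailing whitespace")
    return problems


def nvcc_command(cuda_home):
    """Join a toolkit root with bin/nvcc, as the command in the log appears to be built."""
    return os.path.join(cuda_home, "bin", "nvcc")


if __name__ == "__main__":
    # Value inferred from the log (an assumption, not confirmed in the thread).
    suspect = ":/usr/local/cuda:/usr/local/cuda"
    print(nvcc_command(suspect))
    print(check_cuda_home(suspect))
    # What the current shell actually exports, if anything.
    print(check_cuda_home(os.environ.get("CUDA_HOME")))
```

If the current value does contain a colon, setting it to a single directory (for example `export CUDA_HOME=/usr/local/cuda`, assuming that is where the toolkit is installed) and rebuilding would be the thing to try first.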

alexandrosstergiou commented 3 years ago

I see that you are using Anaconda. Did you also install PyTorch from conda (i.e. $ conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch)? I have also recently encountered build problems when cudatoolkit is installed in a conda environment.

Best, Alex

Jackyinuo commented 3 years ago

@alexandrosstergiou Hi alexandrosstergiou. Yes, I did use conda to install PyTorch etc. Should I install PyTorch and cudatoolkit with pip instead?

alexandrosstergiou commented 3 years ago

I believe your problem is that you are using the conda environment's version of cudatoolkit rather than the one installed system-wide on your computer (if there is one). If possible, remove cudatoolkit from your conda environment. If you are still having problems, you should probably re-do the install procedure for the CUDA toolkit [link] and cuDNN [link], and add them to your ~/.bashrc file, i.e.:

$ echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
$ echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/include:$LD_LIBRARY_PATH' >> ~/.bashrc

Best, Alex

alexandrosstergiou commented 3 years ago

Closing due to inactivity. Feel free to open a new issue if there is a new problem.

Best, Alex