zhanghang1989 / PyTorch-Encoding

A CV toolkit for my papers.
https://hangzhang.org/PyTorch-Encoding/
MIT License

Error in Building the Docker Container #419

Open pramitd opened 1 year ago

pramitd commented 1 year ago

Hi, thank you for the codebase. I am trying to run the lang-seg model (https://github.com/isl-org/lang-seg), which requires torch-encoding. For my research I need to build a Singularity container from a Docker image and then add any further dependencies to the Singularity container.

To make sure I have the correct environment, I decided to build the Docker image provided here and then convert it to Singularity.

When running bash scripts/build_docker.sh, I get the following error:

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1471, in _run_ninja_build
    check=True)
  File "/opt/conda/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 125, in <module>
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
  File "/opt/conda/lib/python3.6/site-packages/setuptools/__init__.py", line 161, in setup
    return distutils.core.setup(**attrs)
  File "/opt/conda/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/opt/conda/lib/python3.6/site-packages/setuptools/command/develop.py", line 38, in run
    self.install_for_development()
  File "/opt/conda/lib/python3.6/site-packages/setuptools/command/develop.py", line 140, in install_for_development
    self.run_command('build_ext')
  File "/opt/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/opt/conda/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 87, in run
    _build_ext.run(self)
  File "/opt/conda/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/opt/conda/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 633, in build_extensions
    build_ext.build_extensions(self)
  File "/opt/conda/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 194, in build_extensions
    self.build_extension(ext)
  File "/opt/conda/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 208, in build_extension
    _build_ext.build_extension(self, ext)
  File "/opt/conda/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 462, in unix_wrap_ninja_compile
    with_cuda=with_cuda)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1211, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1484, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

Any clue as to what I am missing here? Do I need some library on my system to build the Docker image? I have been stuck on this issue for more than a week!

Looking forward to your reply. Best Regards, Pramit

zixinglin07 commented 1 year ago

I have the same exact issue on docker as well, also coming here from lang-seg to try to get it to work. I've also tried regular pip install but "import encoding" would always generate an error. Hope this gets attention soon!

zhanghang1989 commented 1 year ago

Is the docker in torch-encoding working? https://hangzhang.org/PyTorch-Encoding/notes/compile.html#using-docker

zhanghang1989 commented 1 year ago

For issues related to lang-seg, it would be best to reach out to the original authors.

pramitd commented 1 year ago

Hi, the error posted above occurs while building the Dockerfile in this repository, not lang-seg.

dongjunhwang commented 1 year ago

In my case, I solved this error by starting the Docker container with a direct command (instead of using run_docker.sh) and then installing torch-encoding with pip:

docker run -it -d -P --gpus all --name encoding nvcr.io/nvidia/pytorch:20.06-py3
docker exec -it encoding bash
pip install torch-encoding
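
After the install, it may be worth confirming inside the container that both torch and encoding import cleanly, since a plain pip install was reported above to still fail at "import encoding". The following is only a minimal sketch: check_module is a hypothetical helper name, not part of torch-encoding, and it reports whether a top-level module is missing, fails on import, or loads.

```python
import importlib
import importlib.util


def check_module(name):
    """Return a short status string for a top-level module name.

    Hypothetical helper for illustration; not part of torch-encoding.
    """
    # find_spec returns None when no top-level module of this name is installed.
    if importlib.util.find_spec(name) is None:
        return f"{name}: not installed"
    try:
        importlib.import_module(name)
    except Exception as exc:
        # An installed package can still fail at import time, for example
        # when a compiled extension cannot be loaded.
        return f"{name}: import failed ({type(exc).__name__}: {exc})"
    return f"{name}: ok"


if __name__ == "__main__":
    for module_name in ("torch", "encoding"):
        print(check_module(module_name))
```

If encoding shows "import failed", the message in parentheses is usually a better starting point than the generic build error.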

Also, this error can occur when you use pip install git+https://github.com/zhanghang1989/PyTorch-Encoding/, so you should use pip install torch-encoding instead.
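
If you do need to build from source, note that the RuntimeError in the traceback above is only a wrapper: the actual compiler message is normally printed earlier in the ninja output, after a line starting with "FAILED:". Here is a rough sketch for pulling those lines out of a saved build log. The function name extract_build_errors and the idea of saving the log to a file are my own assumptions, not something provided by this repository.

```python
import re


def extract_build_errors(log_text, context=2):
    """Return log lines at or just after ninja 'FAILED:' markers and compiler 'error:' messages.

    Hypothetical helper for illustration; `context` is the number of
    following lines kept after each match.
    """
    lines = log_text.splitlines()
    # Case-sensitive on purpose: matches compiler output such as
    # "fatal error:" but not Python's "RuntimeError:" wrapper lines.
    pattern = re.compile(r"^FAILED:|error:")
    keep = set()
    for i, line in enumerate(lines):
        if pattern.search(line):
            keep.update(range(i, min(len(lines), i + context + 1)))
    return [lines[i] for i in sorted(keep)]
```

For example, after redirecting the build output to a file of your choice, you could call extract_build_errors(open(path).read()) and post the result here, which would make the underlying compile failure much easier to diagnose.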