hirotomusiker / CLRerNet

The official implementation of "CLRerNet: Improving Confidence of Lane Detection with LaneIoU"
Apache License 2.0

Avoid `nms` build error due to unspecified `TORCH_CUDA_ARCH_LIST` #2

Closed by PINTO0309 1 year ago

PINTO0309 commented 1 year ago

Thank you for publishing this excellent paper implementation.

This PR avoids the nms build error caused by an unspecified TORCH_CUDA_ARCH_LIST. If the variable is not specified, the following error is likely to occur when running docker-compose build or docker compose build.

#25 11.14 Traceback (most recent call last):
#25 11.14   File "/tmp/nms/setup.py", line 5, in <module>
#25 11.14     setup(
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
#25 11.14     return distutils.core.setup(**attrs)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
#25 11.14     return run_commands(dist)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
#25 11.14     dist.run_commands()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
#25 11.14     self.run_command(cmd)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
#25 11.14     super().run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#25 11.14     cmd_obj.run()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/install.py", line 80, in run
#25 11.14     self.do_egg_install()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/install.py", line 129, in do_egg_install
#25 11.14     self.run_command('bdist_egg')
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#25 11.14     self.distribution.run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
#25 11.14     super().run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#25 11.14     cmd_obj.run()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 164, in run
#25 11.14     cmd = self.call_command('install_lib', warn_dir=0)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
#25 11.14     self.run_command(cmdname)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#25 11.14     self.distribution.run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
#25 11.14     super().run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#25 11.14     cmd_obj.run()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/install_lib.py", line 11, in run
#25 11.14     self.build()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/command/install_lib.py", line 111, in build
#25 11.14     self.run_command('build_ext')
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#25 11.14     self.distribution.run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
#25 11.14     super().run_command(command)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#25 11.14     cmd_obj.run()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 84, in run
#25 11.14     _build_ext.run(self)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
#25 11.14     self.build_extensions()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
#25 11.14     build_ext.build_extensions(self)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
#25 11.14     self._build_extensions_serial()
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
#25 11.14     self.build_extension(ext)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
#25 11.14     _build_ext.build_extension(self, ext)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/Cython/Distutils/build_ext.py", line 127, in build_extension
#25 11.14     super(build_ext, self).build_extension(ext)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
#25 11.14     objects = self.compiler.compile(
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/setuptools/_distutils/ccompiler.py", line 600, in compile
#25 11.14     self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 513, in unix_wrap_single_compile
#25 11.14     cflags = unix_cuda_flags(cflags)
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 480, in unix_cuda_flags
#25 11.14     cflags + _get_cuda_arch_flags(cflags))
#25 11.14   File "/home/docker/.pyenv/versions/3.8.4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1694, in _get_cuda_arch_flags
#25 11.14     arch_list[-1] += '+PTX'
#25 11.14 IndexError: list index out of range
------
failed to solve: rpc error: code = Unknown desc = process "/bin/sh -c python /tmp/nms/setup.py install" did not complete successfully: exit code: 1
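
For readers wondering why an unset variable ends in an IndexError, the following is a minimal sketch of the logic the last traceback frame points at. It is a simplified paraphrase, not the actual torch source: the function name `cuda_arch_list_sketch` and the `visible_gpus` parameter are hypothetical, and it assumes that no GPU is visible to the process while the image is being built.

```python
from typing import List, Mapping, Sequence, Tuple


def cuda_arch_list_sketch(
    env: Mapping[str, str],
    visible_gpus: Sequence[Tuple[int, int]],
) -> List[str]:
    """Simplified paraphrase of the arch-list selection (hypothetical helper).

    env          -- environment mapping, e.g. os.environ
    visible_gpus -- (major, minor) compute capabilities the build can see
    """
    requested = env.get("TORCH_CUDA_ARCH_LIST")
    if requested:
        # An explicit list was given: use it and skip GPU detection entirely.
        return requested.replace(" ", ";").split(";")

    # No explicit list: fall back to the GPUs that are currently visible.
    arch_list = sorted({f"{major}.{minor}" for major, minor in visible_gpus})
    # With zero visible GPUs the list is empty, so indexing [-1] raises
    # IndexError, which is the final line of the traceback above.
    arch_list[-1] += "+PTX"
    return arch_list
```

Under these assumptions, calling the function with an empty environment and no visible GPUs raises IndexError, while any non-empty TORCH_CUDA_ARCH_LIST bypasses the detection path.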

This change exposes TORCH_CUDA_ARCH_LIST as a build argument, so each developer can set it to match the architecture of their own GPU.

ARG TORCH_CUDA_ARCH_LIST=7.5;8.0;8.6
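
As a rough illustration of how a value for this argument could be assembled, the sketch below joins compute capabilities into the semicolon-separated form used above. The helper name `format_arch_list` is hypothetical. On a host with PyTorch and a GPU, the (major, minor) tuples could come from `torch.cuda.get_device_capability()`; here they are hard-coded so the example runs without a GPU.

```python
from typing import Iterable, Tuple


def format_arch_list(capabilities: Iterable[Tuple[int, int]]) -> str:
    """Join (major, minor) compute capabilities into a ';'-separated string."""
    # Sort numerically on the tuples and drop duplicates before formatting.
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


if __name__ == "__main__":
    # Hard-coded example capabilities matching the default in the ARG line.
    print(format_arch_list([(8, 6), (7, 5), (8, 0), (7, 5)]))  # prints 7.5;8.0;8.6
```

The resulting string could then be supplied at build time, for example with `--build-arg TORCH_CUDA_ARCH_LIST=...` on `docker compose build`, assuming the Dockerfile declares the ARG as in this PR.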

My environment.

$ nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

$ nvidia-smi

Tue Jul 18 20:48:58 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   52C    P8    21W / 220W |    770MiB /  8192MiB |     10%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1880      G   /usr/lib/xorg/Xorg                 53MiB |
|    0   N/A  N/A      2575      G   /usr/lib/xorg/Xorg                342MiB |
|    0   N/A  N/A      2745      G   /usr/bin/gnome-shell               52MiB |
|    0   N/A  N/A      6504      G   ...460863084046030754,262144      125MiB |
|    0   N/A  N/A     59001      G   ...veSuggestionsOnlyOnDemand       69MiB |
|    0   N/A  N/A    335275      G   ...RendererForSitePerProcess      113MiB |
+-----------------------------------------------------------------------------+

$ docker compose version

Docker Compose version v2.2.3

Inference results image

hirotomusiker commented 1 year ago

Thanks for the PR! Please let me check it on our environments.

hirotomusiker commented 1 year ago

It looks like the error above occurs not during the mmcv installation but during the nms module installation.

FYI: https://github.com/hirotomusiker/CLRerNet/blob/main/docs/INSTALL.md#docker-compose

I'm currently checking whether the nms installation succeeds with TORCH_CUDA_ARCH_LIST specified.

hirotomusiker commented 1 year ago

nms installs successfully when TORCH_CUDA_ARCH_LIST is specified. Thank you!

hirotomusiker commented 1 year ago

LGTM!