Open TRS07170 opened 1 year ago
How did you install PyTorch3D? Can you start again in a new conda environment following the recommended steps (https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md) to install a prebuilt conda package?
I'm getting the same error. I'm installing torch (2.0.1, cu118) and pytorch3d in a Docker container. CUDA is definitely available in PyTorch in the container. If I set FORCE_CUDA when building the image, I get an "Unknown CUDA error". It works after the image is built if I start the container and run the same pip command with FORCE_CUDA set in the shell.
This is likely because a GPU is not available at build time, but I can't yet explain the other error I'm getting when installing PyTorch3D:
#0 32.77 self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
#0 32.77 File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 581, in unix_wrap_single_compile
#0 32.77 cflags = unix_cuda_flags(cflags)
#0 32.77 File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 548, in unix_cuda_flags
#0 32.77 cflags + _get_cuda_arch_flags(cflags))
#0 32.77 File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1773, in _get_cuda_arch_flags
#0 32.77 arch_list[-1] += '+PTX'
#0 32.77 IndexError: list index out of range
#0 32.77 [end of output]
#0 32.77
#0 32.77 note: This error originates from a subprocess, and is likely not a problem with pip.
#0 32.77 ERROR: Failed building wheel for pytorch3d
#0 32.77 Running setup.py clean for pytorch3d
#0 33.93 Failed to build pytorch3d
#0 33.93 ERROR: Could not build wheels for pytorch3d, which is required to install pyproject.toml-based projects
OK, it looks like this is due to PyTorch not being able to populate arch_list if a GPU is not available at build time, which in turn affects PyTorch3D's installation.
I suppose one could manually set the env var as described here. Also, see this.
Since I'm using this to set up my dev environment with vscode devcontainers, I guess another way out would be to add the installation of torch and pytorch3d to a post start hook.
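To make the failure mode in the traceback above concrete, here is a minimal sketch of the logic as I understand it. It is a simplified stand-in, not the actual code in torch/utils/cpp_extension.py; the function name resolve_arch_list and its parameters are made up for illustration.

```python
# Simplified stand-in for the arch-list logic; NOT the actual PyTorch source.
import os


def resolve_arch_list(visible_capabilities, env=None):
    """Return the list of CUDA architectures to compile for.

    visible_capabilities: (major, minor) tuples for GPUs seen at build time.
    env: mapping to read TORCH_CUDA_ARCH_LIST from (defaults to os.environ).
    """
    env = os.environ if env is None else env
    requested = env.get("TORCH_CUDA_ARCH_LIST")
    if requested:
        # An explicit list takes priority, so no GPU has to be visible.
        return requested.replace(";", " ").split()
    # Otherwise fall back to whatever devices are visible right now.
    arch_list = [f"{major}.{minor}" for major, minor in visible_capabilities]
    # With no env var and no visible GPU the list is empty, so this
    # indexing raises IndexError, matching the last frame of the traceback.
    arch_list[-1] += "+PTX"
    return arch_list
```

Calling resolve_arch_list([], env={}) mimics a docker build step with no GPU and no env var, and it raises IndexError. Passing an env with TORCH_CUDA_ARCH_LIST set skips device detection entirely.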
I got it to work by manually setting the env var TORCH_CUDA_ARCH_LIST="Turing Ampere Ada Hopper" before installing torch and pytorch3d in the Dockerfile. Even though a GPU is not available at build time, and PyTorch is not able to populate the arch_list variable here, it will use the values defined in the env var and the problem goes away. For more details on how the arch_list variable is populated, see this.
@TRS07170, hopefully this can be of help. Let me know if there is any other information I can provide.
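For anyone who would rather drive the install from a script than from a Dockerfile, the same idea can be sketched in Python. The helper names cuda_build_env and install_pytorch3d are hypothetical, and the package spec is left as a parameter because the exact pip command isn't shown in this thread.

```python
import os
import subprocess
import sys


def cuda_build_env(arch_list, base_env=None):
    """Copy the environment and add the variables for a GPU build."""
    env = dict(os.environ if base_env is None else base_env)
    env["FORCE_CUDA"] = "1"  # build the CUDA kernels even if no GPU is visible
    env["TORCH_CUDA_ARCH_LIST"] = arch_list  # e.g. "Turing Ampere Ada Hopper"
    return env


def install_pytorch3d(package, arch_list):
    """Run pip in a subprocess with the build variables set."""
    cmd = [sys.executable, "-m", "pip", "install", package]
    return subprocess.run(cmd, env=cuda_build_env(arch_list), check=True)


# Example call (assumed package spec, adjust to whatever you normally install):
# install_pytorch3d("git+https://github.com/facebookresearch/pytorch3d.git",
#                   "Turing Ampere Ada Hopper")
```

Both variables have to be set in the environment of the process that compiles the extension, which is why they are passed to the pip subprocess rather than set afterwards.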
Hi, I'm facing the same problem:
Traceback (most recent call last):
File "demo.py", line 160, in <module>
rgb, depth = render_mesh_orthogonal(mesh, face, render_cam_params, (img_height,img_width), h)
File "/home/federico/repos/InterWild/main/../common/utils/vis.py", line 168, in render_mesh_orthogonal
images, fragments = renderer(mesh, materials=materials)
File "/home/federico/repos/InterWild_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/renderer/mesh/renderer.py", line 107, in forward
fragments = self.rasterizer(meshes_world, **kwargs)
File "/home/federico/repos/InterWild_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/renderer/mesh/rasterizer.py", line 252, in forward
pix_to_face, zbuf, bary_coords, dists = rasterize_meshes(
File "/home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/renderer/mesh/rasterize_meshes.py", line 223, in rasterize_meshes
pix_to_face, zbuf, barycentric_coords, dists = _RasterizeFaceVerts.apply(
File "/home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/renderer/mesh/rasterize_meshes.py", line 297, in forward
pix_to_face, zbuf, barycentric_coords, dists = _C.rasterize_meshes(
RuntimeError: Not compiled with GPU support
Exception raised from RasterizeMeshesCoarse at /tmp/pip-req-build-1iudualq/pytorch3d/csrc/rasterize_meshes/rasterize_meshes.h:306 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7f2127d151ee in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x60 (0x7f2127cf06a9 in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: RasterizeMeshesCoarse(at::Tensor const&, at::Tensor const&, at::Tensor const&, std::tuple<int, int>, float, int, int) + 0x83 (0x7f20e3ced873 in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/_C.cpython-38-x86_64-linux-gnu.so)
frame #3: RasterizeMeshes(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, std::tuple<int, int>, float, int, int, int, bool, bool, bool) + 0x85 (0x7f20e3cee2f5 in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/_C.cpython-38-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x41adb (0x7f20e3d12adb in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/_C.cpython-38-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x357c3 (0x7f20e3d067c3 in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/pytorch3d/_C.cpython-38-x86_64-linux-gnu.so)
<omitting python frames>
frame #12: THPFunction_apply(_object*, _object*) + 0x5f6 (0x7f2177201266 in /home/federico/repos/InterWild_venv/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #21: python() [0x50b17c]
frame #26: python() [0x59d29f]
frame #31: python() [0x50b17c]
frame #36: python() [0x59d29f]
frame #43: python() [0x67dbf1]
frame #44: python() [0x67dc6f]
frame #45: python() [0x67dd11]
frame #49: __libc_start_main + 0xf3 (0x7f21bea9d083 in /lib/x86_64-linux-gnu/libc.so.6)
I tried the solution proposed by @edufschmidt, setting TORCH_CUDA_ARCH_LIST="Pascal", since I'm using an NVIDIA GTX 1080 GPU. I also tried different versions of gcc, but I always get the same error. I'm using torch 1.12.0+cu116.
Thanks in advance for the support!
@edufschmidt Thank you for your response! Though it seems like mine was not this complicated.
@fedeceola I followed @bottler's advice and re-installed pytorch3d in a new virtual environment with FORCE_CUDA=1. The reason the first attempt did not work was probably that I first installed pytorch3d without specifying FORCE_CUDA=1, and then installed it again in the same environment with FORCE_CUDA=1. What I would recommend is to start with a new environment and make sure you specify FORCE_CUDA=1 the first time you install pytorch3d.
What I would recommend is to start with a new environment and make sure you specify FORCE_CUDA=1 the first time you install pytorch3d.
+1 to that, although you might also want to set TORCH_CUDA_ARCH_LIST. also, not sure if it's something you're open to, but I've been using vscode's dev containers (along with nvidia-docker) for ensuring that my environment is reproducible. there might be downsides to this approach, but it's been pretty solid so far.
Thanks @TRS07170, @edufschmidt. I re-installed pytorch3d in a new environment and now it is working!
I'm facing the following RuntimeError when trying to run code using Pytorch3D:
I tried to re-install pytorch3d with FORCE_CUDA=1 as proposed in https://github.com/facebookresearch/pytorch3d/issues/1161, and I also set CUDA_HOME=/usr/local/cuda-11.7, but neither worked. PyTorch reports that CUDA is available. I wonder what could possibly be going wrong? Or are there any other possible solutions to this problem?
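Since PyTorch can report CUDA as available while the installed pytorch3d extension was still built CPU-only, a small check like the one below may help separate the two. This is a rough sketch, not an official diagnostic: the function names are hypothetical, and it uses pytorch3d.ops.knn_points only as a convenient op that calls into the compiled extension.

```python
def capability_to_arch(capability):
    """Format a (major, minor) compute capability as e.g. '6.1'."""
    major, minor = capability
    return f"{major}.{minor}"


def diagnose():
    """Collect facts that separate a PyTorch problem from a PyTorch3D one."""
    report = {}
    try:
        import torch
    except ImportError:
        report["torch"] = "not installed"
        return report
    report["torch"] = torch.__version__
    report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report
    # A value like "6.1" (GTX 1080) can be used in TORCH_CUDA_ARCH_LIST.
    report["arch"] = capability_to_arch(torch.cuda.get_device_capability(0))
    try:
        from pytorch3d.ops import knn_points
    except ImportError:
        report["pytorch3d"] = "import failed"
        return report
    points = torch.rand(1, 8, 3, device="cuda")
    try:
        knn_points(points, points, K=1)  # runs through the compiled extension
        report["pytorch3d_gpu"] = True
    except RuntimeError as err:
        # A CPU-only build is expected to fail here on CUDA tensors.
        report["pytorch3d_gpu"] = False
        report["pytorch3d_error"] = str(err).splitlines()[0]
    return report


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

If cuda_available is True but pytorch3d_gpu is False, the extension itself was built without CUDA, and a clean reinstall with FORCE_CUDA=1 in a fresh environment (as suggested above) is the thing to try.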