NVlabs / nvdiffrast

Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

An error when running RasterizeCudaContext() #120

Closed L1onKing closed 11 months ago

L1onKing commented 1 year ago

Hello! I installed nvdiffrast successfully, with no issues along the way:

nvdiffrast>pip install .
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Processing d:\work\nvdiffrast
  Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in d:\work\pythonenvironments\working_env\lib\site-packages (from nvdiffrast==0.3.0) (1.23.4)
Building wheels for collected packages: nvdiffrast
  Building wheel for nvdiffrast (setup.py) ... done
  Created wheel for nvdiffrast: filename=nvdiffrast-0.3.0-py3-none-any.whl size=141050 sha256=14e1b37b5c54418187ddc0aebb0b53abf30d18753ce00d0dd473e23d5064de4f
  Stored in directory: C:\Users\Yevhen\AppData\Local\Temp\pip-ephem-wheel-cache-cqao50id\wheels\36\2b\e4\974b79cf6e99e04170eec5496204a5838e9b51d83b20cc75f1
Successfully built nvdiffrast
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
Installing collected packages: nvdiffrast
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
Successfully installed nvdiffrast-0.3.0
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
WARNING: Ignoring invalid distribution -rotobuf (d:\work\pythonenvironments\working_env\lib\site-packages)
WARNING: You are using pip version 21.3.1; however, version 23.1.2 is available.
You should consider upgrading via the 'D:\Work\PythonEnvironments\working_env\Scripts\python.exe -m pip install --upgrade pip' command.

But when I tried to run a test script, I got an error at the line marked below:

import cv2
import numpy as np
import torch
import nvdiffrast.torch as dr
import sys

def tensor(*args, **kwargs):
    return torch.tensor(*args, device='cuda', **kwargs)

if __name__ == '__main__':
    glctx = dr.RasterizeCudaContext()     # THIS LINE

The error is as follows:

File "D:\Work\PythonEnvironments\working_env\lib\site-packages\torch\utils\cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "D:\Work\PythonEnvironments\working_env\lib\site-packages\torch\utils\cpp_extension.py", line 1917, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'nvdiffrast_plugin': [1/1] "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\Hostx64\x64/link.exe" Buffer.o CudaRaster.o RasterImpl.cuda.o RasterImpl.o common.o rasterize.cuda.o interpolate.cuda.o texture.cuda.o texture.o antialias.cuda.o torch_bindings.o torch_rasterize.o torch_interpolate.o torch_texture.o torch_antialias.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda_cu.lib -INCLUDE:?_torch_cuda_cu_linker_symbol_op_cuda@native@at@@YA?AVTensor@2@AEBV32@@Z torch_cuda_cpp.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:D:\Work\PythonEnvironments\working_env\lib\site-packages\torch\lib torch_python.lib /LIBPATH:D:\Work\PythonEnvironments\working_env\Scripts\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64" cudart.lib /out:nvdiffrast_plugin.pyd
FAILED: nvdiffrast_plugin.pyd 
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\Hostx64\x64/link.exe" Buffer.o CudaRaster.o RasterImpl.cuda.o RasterImpl.o common.o rasterize.cuda.o interpolate.cuda.o texture.cuda.o texture.o antialias.cuda.o torch_bindings.o torch_rasterize.o torch_interpolate.o torch_texture.o torch_antialias.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda_cu.lib -INCLUDE:?_torch_cuda_cu_linker_symbol_op_cuda@native@at@@YA?AVTensor@2@AEBV32@@Z torch_cuda_cpp.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:D:\Work\PythonEnvironments\working_env\lib\site-packages\torch\lib torch_python.lib /LIBPATH:D:\Work\PythonEnvironments\working_env\Scripts\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64" cudart.lib /out:nvdiffrast_plugin.pyd

Could you please advise how to fix this issue? I suspect the problem is related to CUDA, but what exactly is wrong? A version mismatch, perhaps?

I have CUDA installed and use it with PyTorch successfully, so I don't think the installation itself is broken. My CUDA version is 11.8.

L1onKing commented 1 year ago

I think I have spotted the issue:

"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\Hostx64\x64/link.exe" Buffer.o CudaRaster.o RasterImpl.cuda.o RasterImpl.o common.o rasterize.cuda.o interpolate.cuda.o texture.cuda.o texture.o antialias.cuda.o torch_bindings.o torch_rasterize.o torch_interpolate.o torch_texture.o torch_antialias.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda_cu.lib -INCLUDE:?_torch_cuda_cu_linker_symbol_op_cuda@native@at@@YA?AVTensor@2@AEBV32@@Z torch_cuda_cpp.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:D:\Work\PythonEnvironments\working_env\lib\site-packages\torch\lib torch_python.lib /LIBPATH:D:\Work\PythonEnvironments\working_env\Scripts\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64" cudart.lib /out:nvdiffrast_plugin.pyd
LINK : fatal error LNK1104: cannot open file 'python38.lib'

It seems the linker command does not include the correct path to python38.lib. My python38.lib is here:

C:\Users\username\AppData\Local\Programs\Python\Python38\libs

Can you advise how I can add that path to the command? Thanks!
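
As a starting point for diagnosing this, here is a minimal sketch (not from the thread) that checks which candidate directories actually contain the import library. The helper names `import_lib_name` and `find_import_lib` are hypothetical. The two candidate directories are assumptions: the `libs` folder next to the running interpreter, which matches the `Scripts\libs` entry in the linker command above, and the `libs` folder of the base Python installation.

```python
import sys
from pathlib import Path


def import_lib_name(major: int, minor: int) -> str:
    """Name of the Windows import library for a CPython version, e.g. python38.lib."""
    return f"python{major}{minor}.lib"


def find_import_lib(search_dirs, lib_name):
    """Return the first directory in search_dirs that contains lib_name, or None."""
    for directory in search_dirs:
        if (Path(directory) / lib_name).is_file():
            return Path(directory)
    return None


if __name__ == "__main__":
    lib_name = import_lib_name(sys.version_info.major, sys.version_info.minor)
    # Assumed candidates (not confirmed by the thread):
    #   1. "libs" next to the interpreter executable (Scripts\libs inside a venv)
    #   2. "libs" inside the base installation the venv was created from
    candidates = [
        Path(sys.executable).parent / "libs",
        Path(sys.base_prefix) / "libs",
    ]
    for directory in candidates:
        status = "found" if (directory / lib_name).is_file() else "missing"
        print(f"{directory}: {lib_name} {status}")
```

If the first directory reports "missing" and the second reports "found" when run inside the virtual environment, that would be consistent with the LNK1104 error above.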

s-laine commented 1 year ago

This looks like a problem with PyTorch's extension build mechanism. When building the CUDA/C++ extension, nvdiffrast only supplies the names of the code-containing source files to PyTorch and has it execute the build. I unfortunately don't know why PyTorch doesn't add the relevant library path to the command line of the linker, or if it can be instructed to do that somehow.
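
One possible way to supply the path without changing the command itself, not tried in this thread: MSVC's link.exe also searches the directories listed in the `LIB` environment variable. The sketch below prepends a directory to `LIB` in an environment mapping. The helper name `prepend_lib_dir` is hypothetical, and whether the modified variable actually reaches the linker depends on how the build subprocess sets up its environment, which is an unverified assumption here.

```python
def prepend_lib_dir(env, lib_dir, sep=";"):
    """Prepend lib_dir to the LIB entry of an environment mapping.

    env is any mutable mapping such as os.environ. sep defaults to ";",
    the path-list separator on Windows, since LIB is specific to MSVC.
    Returns the new value of LIB.
    """
    lib_dir = str(lib_dir)
    # Split the existing value and drop empty segments.
    existing = [p for p in env.get("LIB", "").split(sep) if p]
    # Avoid adding the same directory twice.
    if lib_dir not in existing:
        existing.insert(0, lib_dir)
    env["LIB"] = sep.join(existing)
    return env["LIB"]
```

In a script, this would be called as `prepend_lib_dir(os.environ, r"C:\Users\username\AppData\Local\Programs\Python\Python38\libs")` before the first call that triggers the extension build, such as `dr.RasterizeCudaContext()`.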

This could be some sort of version conflict as you suspect. Because PyTorch doesn't point the linker to the correct Python library, I'm guessing the PyTorch version you have installed is not fully compatible with the Python version.

As a crude first test, you could try copying the missing library file into the library directory that is specified in the command line. However, the linker error is probably a symptom of a deeper problem, so I'd be surprised if this fixed it.
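
The copy test described above could be scripted roughly as follows. This is a sketch only: the helper name `copy_import_lib` is hypothetical, and the source and destination directories must be taken from your own environment and linker command.

```python
import shutil
from pathlib import Path


def copy_import_lib(src_dir, dst_dir, lib_name="python38.lib"):
    """Copy lib_name from src_dir into dst_dir and return the destination path.

    dst_dir is created if it does not exist. Raises FileNotFoundError if the
    library is not present in src_dir.
    """
    src = Path(src_dir) / lib_name
    if not src.is_file():
        raise FileNotFoundError(f"{src} does not exist")
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata such as timestamps.
    return Path(shutil.copy2(src, dst_dir / lib_name))
```

With the paths quoted in this thread, the source would be `C:\Users\username\AppData\Local\Programs\Python\Python38\libs` and the destination one of the `/LIBPATH` directories from the linker command, for example `D:\Work\PythonEnvironments\working_env\Scripts\libs`.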

nsarafianos commented 11 months ago

@s-laine

Thanks for all your insights. Just wanted to follow up: interestingly enough, "just" copying the libraries from the AppData path to the lib folder of Visual Studio (I'm using 2022 Professional) worked just fine.

s-laine commented 11 months ago

Great to hear you found a workaround. Hopefully this isn't too common a problem — there are so many ways people's environments can differ that there always seem to be new ways for things to fail.

Closing for now, but will reopen for further troubleshooting if there are more reports of the same issue.