Closed jdagdelen closed 2 years ago
I cannot say I've ever seen an internal linker error before. I'm only really guessing here, but maybe there is some sort of version conflict between your compilation tools, or the CUDA libraries are corrupted? Just in case it's a transient bug, try clearing the torch extension cache and running the sample again. If the issue persists, all I can think of is reinstalling Visual C++ or the CUDA toolkit, clearing the torch extension cache, and trying again. I understand this isn't much help.
Location of the extension cache may depend on your Python and PyTorch installation, but on my machine it's at %localappdata%\torch_extensions\torch_extensions\Cache\nvdiffrast_plugin. If you have trouble locating it, see what torch.utils.cpp_extension._get_build_directory('nvdiffrast_plugin', False) returns.
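The cache-clearing step described above could be scripted roughly as follows. This is a minimal sketch, not part of nvdiffrast: the helper names `find_build_directory` and `clear_build_directory` are made up for illustration, and `_get_build_directory` is a private PyTorch function, so its behavior may differ between PyTorch versions.

```python
import os
import shutil


def find_build_directory(plugin_name="nvdiffrast_plugin"):
    """Ask PyTorch where it caches the JIT-built extension.

    Returns None when torch is not importable in this environment.
    """
    try:
        from torch.utils import cpp_extension
    except ImportError:
        return None
    # Private helper mentioned in the comment above; treat it as unstable API.
    return cpp_extension._get_build_directory(plugin_name, False)


def clear_build_directory(path):
    """Delete a cached build directory; return True if something was removed."""
    if path and os.path.isdir(path):
        shutil.rmtree(path)
        return True
    return False


# Typical use (hypothetical): run this, then rerun the sample so the
# extension is rebuilt from scratch.
#   clear_build_directory(find_build_directory("nvdiffrast_plugin"))
```

The build log later in this thread names the extension `nvdiffrast_plugin_gl`, so when the OpenGL rasterizer is the one failing, that name is probably the one to pass in.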
@jdagdelen Did you find a solution to this? I'm mostly curious because of the extraordinary error message.
I have this exact same issue. @s-laine, did you ever find a fix for this?
My system details are: Windows 10, Visual Studio 2019 and 2022 (both installed), RTX 2080 Ti
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
>nvidia-smi
Mon Jun 26 10:46:03 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.98                 Driver Version: 535.98       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti   WDDM  | 00000000:65:00.0  On |                  N/A |
| 40%   43C    P2              55W / 260W |   9819MiB / 11264MiB |     26%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
Traceback (most recent call last):
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\utils\cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\nvdiffrec\train.py", line 556, in <module>
glctx = dr.RasterizeGLContext()
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\nvdiffrast\torch\ops.py", line 221, in __init__
self.cpp_wrapper = _get_plugin(gl=True).RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\nvdiffrast\torch\ops.py", line 118, in _get_plugin
torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts+['-lineinfo'], extra_ldflags=ldflags, with_cuda=True, verbose=False)
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
return _jit_compile(
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _jit_compile
_write_ninja_file_and_build_library(
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\utils\cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
_run_ninja_build(
File "C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\utils\cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'nvdiffrast_plugin_gl': [1/1] "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64/link.exe" common.o glutil.o rasterize_gl.o torch_bindings_gl.o torch_rasterize_gl.o /nologo /DLL /LIBPATH:C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\nvdiffrast\torch\..\lib /DEFAULTLIB:gdi32 /DEFAULTLIB:opengl32 /DEFAULTLIB:user32 /DEFAULTLIB:setgpu c10.lib c10_cuda.lib torch_cpu.lib torch_cuda.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64" cudart.lib /out:nvdiffrast_plugin_gl.pyd
FAILED: nvdiffrast_plugin_gl.pyd
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64/link.exe" common.o glutil.o rasterize_gl.o torch_bindings_gl.o torch_rasterize_gl.o /nologo /DLL /LIBPATH:C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\nvdiffrast\torch\..\lib /DEFAULTLIB:gdi32 /DEFAULTLIB:opengl32 /DEFAULTLIB:user32 /DEFAULTLIB:setgpu c10.lib c10_cuda.lib torch_cpu.lib torch_cuda.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\Users\utkarsh\AppData\Local\miniconda3\envs\nvdiffrec\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64" cudart.lib /out:nvdiffrast_plugin_gl.pyd
glutil.o : fatal error LNK1000: Internal error during IMAGE::Pass1
ninja: build stopped: subcommand failed.
No, I haven't heard anything new. The only advice I have is the same as above.
@s-laine so I figured out the issue on my end. It happened because I had multiple Visual Studio versions installed (both 2019 and 2022), and the code here was automatically detecting the 2019 path. This auto-detection code doesn't work for the VS2022 Pro version, since its folder is "Pro" instead of "Professional". There are a few other issues in the path here.
I believe this can also be solved by hard-coding the system PATH to the VS2022 directory, but I didn't want to do that, so I just modified this code.
But I'm not sure why this was an issue in the first place. Perhaps I missed somewhere that nvdiffrast doesn't compile with VS2019?
Glad to hear you got it working. There have been some problems with VS paths before, so we should probably take a closer look at them at some point.
Nvdiffrast has been tested to work with VS2019, so that shouldn't be an issue unless some critical components were not installed. Perhaps some part of PyTorch's extension build toolchain found and used VS2022 and the rest used VS2019 — the inner workings of the extension builder are mysterious and change from version to version. Mixing and matching compilation artifacts from different VS versions could at least explain the internal error.
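One way to check the mixed-toolchain hypothesis is to see which Visual Studio install `cl` and `link` resolve to on the current PATH. The sketch below is only a diagnostic aid under two assumptions: the helper names `describe_toolchain` and `report` are hypothetical, and the extension build may locate its tools by other means than PATH, so matching results here do not prove the build itself is consistent.

```python
import re
import shutil

# Matches ...\Microsoft Visual Studio\<year>\<edition>\VC\Tools\MSVC\<version>\...
_VS_PATH = re.compile(
    r"Microsoft Visual Studio[\\/](\d{4})[\\/]([^\\/]+)[\\/]"
    r"VC[\\/]Tools[\\/]MSVC[\\/]([\d.]+)",
    re.IGNORECASE,
)


def describe_toolchain(tool_path):
    """Return (year, edition, msvc_version) parsed from a tool path, or None."""
    if not tool_path:
        return None
    match = _VS_PATH.search(tool_path)
    return match.groups() if match else None


def report(tools=("cl", "link")):
    """Map each tool name to the toolchain it resolves to on the current PATH."""
    return {tool: describe_toolchain(shutil.which(tool)) for tool in tools}


if __name__ == "__main__":
    for tool, info in report().items():
        print(tool, "->", info)
```

If the two entries disagree, or one of them is None because a different program with the same name comes first on PATH, that would be consistent with the mixing of toolchains suggested above.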
I'm trying to debug why I'm getting an error at the ninja build stage for running the examples. I would appreciate hearing any ideas about what could be causing it.
System: Windows 10, RTX 3090, Visual Studio 2019