Linker issue when building the raymarching extensions

nickludlam commented 1 year ago

The raymarching module fails to link when I try to build it, whereas all the other modules build cleanly.

The exact output I see is here:

"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\bin\HostX86\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Anaconda3\envs\dream\lib\site-packages\torch\lib "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib/x64" /LIBPATH:C:\Anaconda3\envs\dream\libs /LIBPATH:C:\Anaconda3\envs\dream /LIBPATH:C:\Anaconda3\envs\dream\PCbuild\amd64 "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\\lib\10.0.19041.0\\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda_cu.lib torch_cuda_cpp.lib /EXPORT:PyInit__raymarching D:\Work\ML\3070\stable-dreamfusion\raymarching\build\temp.win-amd64-cpython-39\Release\Work\ML\3070\stable-dreamfusion\raymarching\src\bindings.obj D:\Work\ML\3070\stable-dreamfusion\raymarching\build\temp.win-amd64-cpython-39\Release\Work\ML\3070\stable-dreamfusion\raymarching\src\raymarching.obj /OUT:build\lib.win-amd64-cpython-39\_raymarching.cp39-win_amd64.pyd /IMPLIB:D:\Work\ML\3070\stable-dreamfusion\raymarching\build\temp.win-amd64-cpython-39\Release\Work\ML\3070\stable-dreamfusion\raymarching\src\_raymarching.cp39-win_amd64.lib
   Creating library D:\Work\ML\3070\stable-dreamfusion\raymarching\build\temp.win-amd64-cpython-39\Release\Work\ML\3070\stable-dreamfusion\raymarching\src\_raymarching.cp39-win_amd64.lib and object D:\Work\ML\3070\stable-dreamfusion\raymarching\build\temp.win-amd64-cpython-39\Release\Work\ML\3070\stable-dreamfusion\raymarching\src\_raymarching.cp39-win_amd64.exp
bindings.obj : error LNK2001: unresolved external symbol "void __cdecl composite_rays(unsigned int,unsigned int,float,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor)" (?composite_rays@@YAXIIMVTensor@at@@0000000@Z)
  Hint on symbols that are defined and could potentially match:
    "void __cdecl composite_rays(unsigned int,unsigned int,float,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor,class at::Tensor)" (?composite_rays@@YAXIIMVTensor@at@@0V12@11000@Z)
build\lib.win-amd64-cpython-39\_raymarching.cp39-win_amd64.pyd : fatal error LNK1120: 1 unresolved externals

The main issue looks to be related to the composite_rays() function. When the object files are linked, the decorated (mangled) name that bindings.obj asks for differs from the one that was actually compiled.

The symbol it wants is: (?composite_rays@@YAXIIMVTensor@at@@0000000@Z) whereas the symbol it has is: (?composite_rays@@YAXIIMVTensor@at@@0V12@11000@Z)

I don't know why it's inserting the extra 0V12@.

I've tried some obvious things like playing with the arguments, but since shencoder, freqencoder and gridencoder all work, I'm stuck, as I'm not particularly familiar with the CUDA build process.

I've got CUDA 11.8 installed, but am using 11.6 within the Anaconda/Python environment, and Python is 3.9.13. PyTorch is 1.12.1.

I understand from the README that this has been tested on Linux, so this might be a long shot.

thorikawa commented 1 year ago

I resolved that issue by removing const from the at::Tensor parameters of composite_rays in raymarching.cu.

Here is a diff:

--- a/raymarching/src/raymarching.cu
+++ b/raymarching/src/raymarching.cu
@@ -905,7 +905,7 @@ __global__ void kernel_composite_rays(
 }

-void composite_rays(const uint32_t n_alive, const uint32_t n_step, const float T_thresh, at::Tensor rays_alive, at::Tensor rays_t, const at::Tensor sigmas, const at::Tensor rgbs, const at::Tensor deltas, at::Tensor weights, at::Tensor depth, at::Tensor image) {
+void composite_rays(const uint32_t n_alive, const uint32_t n_step, const float T_thresh, at::Tensor rays_alive, at::Tensor rays_t, at::Tensor sigmas, at::Tensor rgbs, at::Tensor deltas, at::Tensor weights, at::Tensor depth, at::Tensor image) {
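
For anyone wondering why dropping const helps, here is a minimal sketch. It assumes the header declares composite_rays with plain at::Tensor parameters, which is what the symbol requested by bindings.obj (all tensors encoded as the same type, 0000000) suggests. Tensor and join_names are hypothetical stand-ins so it builds without PyTorch. In standard C++, top-level const on a by-value parameter is not part of the function type, and the Itanium ABI used by GCC and Clang on Linux ignores it when mangling. MSVC appears to keep track of it when it numbers repeated parameter types in the decorated name, which would explain the extra V12@ entry when the declaration and the definition live in different translation units.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for at::Tensor so the sketch builds without PyTorch.
struct Tensor { std::string name; };

// What a header would declare: every tensor passed by value, no const.
std::string join_names(Tensor a, Tensor b);

// A definition that adds top-level const to one by-value parameter.
// The C++ standard treats this as the same function as the declaration above.
std::string join_names(Tensor a, const Tensor b) {
    return a.name + "+" + b.name;
}

// Top-level const on a by-value parameter is not part of the function type,
// so any difference between the two spellings is purely a mangling detail.
static_assert(std::is_same<void(Tensor, const Tensor),
                           void(Tensor, Tensor)>::value,
              "top-level const does not change the function type");
```

Within a single file this compiles and links everywhere. The failure in this issue only shows up because the declaration and the definition are compiled separately, so keeping the const qualifiers identical in both places (or dropping them, as in the diff) sidesteps the mismatch.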

nickludlam commented 1 year ago

@thorikawa Amazing, that's exactly the issue! Thank you so much. I guess this patch should be applied to the main repo?

ashawkey commented 1 year ago

Thanks! I haven't tested on Windows yet. Strangely, the same code builds and runs fine on Ubuntu... I will fix it soon.

DuckersMcQuack commented 1 year ago

--- a/raymarching/src/raymarching.cu

The file didn't have that line of code for me sadly, so nothing to replace, and no idea where to put the code :P

ashawkey / stable-dreamfusion

Linker issue when building the raymarching extensions #17