ashawkey / torch-ngp

A pytorch CUDA extension implementation of instant-ngp (sdf and nerf), with a GUI.
MIT License

Compilation issue - RuntimeError: Error building extension '_hash_encoder' #4

Closed wangjksjtu closed 2 years ago

wangjksjtu commented 2 years ago

Thanks for the nice work! I hit the following error when running python train_nerf.py data/fox --workspace trial_nerf. Do you have any thoughts? Many thanks for your help!

Traceback (most recent call last):
  File "/home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1723, in _run_ninja_build
    env=env)
  File "/home/wangjk/anaconda3/envs/largesteps/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train_nerf.py", line 3, in <module>
    from nerf.network import NeRFNetwork
  File "/home/wangjk/programs/torch-ngp/nerf/network.py", line 9, in <module>
    from encoding import get_encoder
  File "/home/wangjk/programs/torch-ngp/encoding.py", line 6, in <module>
    from hashencoder import HashEncoder
  File "/home/wangjk/programs/torch-ngp/hashencoder/__init__.py", line 1, in <module>
    from .hashgrid import HashEncoder
  File "/home/wangjk/programs/torch-ngp/hashencoder/hashgrid.py", line 8, in <module>
    from .backend import _backend
  File "/home/wangjk/programs/torch-ngp/hashencoder/backend.py", line 12, in <module>
    sources=[os.path.join(_src_path, 'src', f) for f in [
  File "/home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1136, in load
    keep_intermediates=keep_intermediates)
  File "/home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1347, in _jit_compile
    is_standalone=is_standalone)
  File "/home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1452, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_hash_encoder': [1/2] /home/wangjk/anaconda3/envs/largesteps/bin/nvcc  -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include/TH -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include/THC -isystem /home/wangjk/anaconda3/envs/largesteps/include -isystem /home/wangjk/anaconda3/envs/largesteps/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -c /home/wangjk/programs/torch-ngp/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o 
FAILED: hashencoder.cuda.o 
/home/wangjk/anaconda3/envs/largesteps/bin/nvcc  -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include/TH -isystem /home/wangjk/anaconda3/envs/largesteps/lib/python3.7/site-packages/torch/include/THC -isystem /home/wangjk/anaconda3/envs/largesteps/include -isystem /home/wangjk/anaconda3/envs/largesteps/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -c /home/wangjk/programs/torch-ngp/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o 
/home/wangjk/programs/torch-ngp/hashencoder/src/hashencoder.cu(26): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (__half *, c10::Half)

1 error detected in the compilation of "/home/wangjk/programs/torch-ngp/hashencoder/src/hashencoder.cu".
ninja: build stopped: subcommand failed.

More info:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jun__2_19:15:15_PDT_2021
Cuda compilation tools, release 11.4, V11.4.48
Build cuda_11.4.r11.4/compiler.30033411_0
>>> import torch
>>> torch.version.cuda
'11.3'
>>> torch.__version__
'1.10.0'

ashawkey commented 2 years ago

What is your GPU's hardware architecture? The code currently uses atomicAdd for __half, which is only available on GPUs with compute capability 7.0 (sm_70) or higher. As a temporary workaround, comment out that function and the place where it is used, and make sure level_dim is even (a minimum compute capability of 6.0 is still required for __half2).
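
A minimal sketch of checking this before building. The helper name half_atomics_supported is hypothetical and not part of torch-ngp; the thresholds are the ones given in the comment above (7.0 for atomicAdd on __half, 6.0 for __half2).

```python
from typing import Dict, Tuple


def half_atomics_supported(capability: Tuple[int, int]) -> Dict[str, bool]:
    """Map a (major, minor) compute capability to the half-precision atomics it supports.

    Thresholds follow the comment above: atomicAdd on __half needs >= 7.0,
    atomicAdd on __half2 needs >= 6.0. The function name is illustrative only.
    """
    return {
        "atomicAdd(__half)": capability >= (7, 0),
        "atomicAdd(__half2)": capability >= (6, 0),
    }


# Example: an sm_61 card, as in the -gencode flags of the log above.
print(half_atomics_supported((6, 1)))
```

With PyTorch installed and a GPU visible, the capability tuple for the local device can be read with torch.cuda.get_device_capability() and passed to the helper.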

aoliao12138 commented 2 years ago

I ran into a similar error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1717, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train_nerf.py", line 3, in <module>
    from nerf.network import NeRFNetwork
  File "/data/new_disk70/wangla/tmp/torch-ngp/nerf/network.py", line 9, in <module>
    from encoding import get_encoder
  File "/data/new_disk70/wangla/tmp/torch-ngp/encoding.py", line 6, in <module>
    from hashencoder import HashEncoder
  File "/data/new_disk70/wangla/tmp/torch-ngp/hashencoder/__init__.py", line 1, in <module>
    from .hashgrid import HashEncoder
  File "/data/new_disk70/wangla/tmp/torch-ngp/hashencoder/hashgrid.py", line 8, in <module>
    from .backend import _backend
  File "/data/new_disk70/wangla/tmp/torch-ngp/hashencoder/backend.py", line 6, in <module>
    _backend = load(name='_hash_encoder',
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1124, in load
    return _jit_compile(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1449, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_hash_encoder': [1/3] :/usr/local/cuda-11.3/bin/nvcc  -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.8/dist-packages/torch/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.8/dist-packages/torch/include/THC -isystem :/usr/local/cuda-11.3/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -c /data/new_disk70/wangla/tmp/torch-ngp/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o 
FAILED: hashencoder.cuda.o 
:/usr/local/cuda-11.3/bin/nvcc  -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.8/dist-packages/torch/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.8/dist-packages/torch/include/THC -isystem :/usr/local/cuda-11.3/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -c /data/new_disk70/wangla/tmp/torch-ngp/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o 
/bin/sh: 1: :/usr/local/cuda-11.3/bin/nvcc: not found
[2/3] c++ -MMD -MF bindings.o.d -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.8/dist-packages/torch/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.8/dist-packages/torch/include/THC -isystem :/usr/local/cuda-11.3/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /data/new_disk70/wangla/tmp/torch-ngp/hashencoder/src/bindings.cpp -o bindings.o 
ninja: build stopped: subcommand failed.

Even after I comment out those two lines, the same error still occurs.

More info:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0
Python 3.8.5 (default, Jul 28 2020, 12:59:40) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.10.2+cu113'

I am using an RTX 3090.

ashawkey commented 2 years ago

@aoliao12138 The error message says /bin/sh: 1: :/usr/local/cuda-11.3/bin/nvcc: not found. Have you added the CUDA bin directory to your PATH (e.g., export PATH="/usr/local/cuda/bin:$PATH")?
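
A rough sketch of checking this from Python. It rests on an assumption that is not confirmed in the thread: that the stray leading colon in :/usr/local/cuda-11.3/bin/nvcc comes from a CUDA_HOME value set with a leading ":" (PyTorch's extension builder derives the nvcc path from CUDA_HOME when it is set). The helper name check_cuda_home is hypothetical.

```python
import os
import shutil
from typing import List, Optional


def check_cuda_home(cuda_home: Optional[str]) -> List[str]:
    """Return human-readable problems found in a CUDA_HOME value (illustrative helper)."""
    problems: List[str] = []
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
        return problems
    if os.pathsep in cuda_home:
        # CUDA_HOME should name one directory, not a PATH-style list.
        problems.append(f"CUDA_HOME contains {os.pathsep!r}: {cuda_home!r}")
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        problems.append(f"nvcc not found at {nvcc}")
    return problems


if __name__ == "__main__":
    print("CUDA_HOME problems:", check_cuda_home(os.environ.get("CUDA_HOME")))
    print("nvcc on PATH:", shutil.which("nvcc"))
```

An empty problem list together with a non-None result from shutil.which("nvcc") suggests the toolchain is discoverable; anything else points at the environment rather than at the torch-ngp sources.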

wangjksjtu commented 2 years ago

@ashawkey Thank you for the prompt reply! My GPU is a GTX 1080 Ti, so the compute capability is 6.1 (sm_61). Commenting out that atomicAdd function seems to work for me. However, the following errors now appear when compiling the fully fused network (ffmlp):

  File "train_nerf.py", line 4, in <module>
    from nerf.network_ff import NeRFNetwork as NeRFNetwork_FF
  File "/home/wangjk/programs/torch-ngp/nerf/network_ff.py", line 10, in <module>
    from ffmlp import FFMLP
  File "/home/wangjk/programs/torch-ngp/ffmlp/__init__.py", line 1, in <module>
    from .ffmlp import FFMLP
  File "/home/wangjk/programs/torch-ngp/ffmlp/ffmlp.py", line 10, in <module>
    from .backend import _backend
  File "/home/wangjk/programs/torch-ngp/ffmlp/backend.py", line 16, in <module>
    sources=[os.path.join(_src_path, 'src', f) for f in [
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1136, in load
    keep_intermediates=keep_intermediates)
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1347, in _jit_compile
    is_standalone=is_standalone)
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1452, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_ffmlp': [1/2] /home/wangjk/anaconda3/envs/torch-ngp/bin/nvcc  -DTORCH_EXTENSION_NAME=_ffmlp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/wangjk/programs/torch-ngp/ffmlp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/TH -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/THC -isystem /home/wangjk/anaconda3/envs/torch-ngp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -std=c++14 -c /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu -o ffmlp.cuda.o 
FAILED: ffmlp.cuda.o 
/home/wangjk/anaconda3/envs/torch-ngp/bin/nvcc  -DTORCH_EXTENSION_NAME=_ffmlp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/wangjk/programs/torch-ngp/ffmlp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/TH -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/THC -isystem /home/wangjk/anaconda3/envs/torch-ngp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -std=c++14 -c /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu -o ffmlp.cuda.o 
/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(243): error: explicit type is missing ("int" assumed)

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(243): error: expected a ")"

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(344): error: explicit type is missing ("int" assumed)

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(344): error: expected a ")"

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(577): error: name followed by "::" must be a class or namespace name

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(396): error: identifier "output_layout" is undefined
          detected during instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" 
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(396): error: name followed by "::" must be a class or namespace name
          detected during instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" 
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(60): error: name must be a namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" 
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" 
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(64): error: identifier "wmma" is undefined
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" 
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" 
(655): here
....
....
85 errors detected in the compilation of "/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu".
ninja: build stopped: subcommand failed.

Full log here:

```
Traceback (most recent call last):
  File "train_nerf.py", line 4, in <module>
    from nerf.network_ff import NeRFNetwork as NeRFNetwork_FF
  File "/home/wangjk/programs/torch-ngp/nerf/network_ff.py", line 10, in <module>
    from ffmlp import FFMLP
  File "/home/wangjk/programs/torch-ngp/ffmlp/__init__.py", line 1, in <module>
    from .ffmlp import FFMLP
  File "/home/wangjk/programs/torch-ngp/ffmlp/ffmlp.py", line 10, in <module>
    from .backend import _backend
  File "/home/wangjk/programs/torch-ngp/ffmlp/backend.py", line 16, in <module>
    sources=[os.path.join(_src_path, 'src', f) for f in [
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1136, in load
    keep_intermediates=keep_intermediates)
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1347, in _jit_compile
    is_standalone=is_standalone)
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1452, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_ffmlp': [1/2] /home/wangjk/anaconda3/envs/torch-ngp/bin/nvcc -DTORCH_EXTENSION_NAME=_ffmlp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/wangjk/programs/torch-ngp/ffmlp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/TH -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/THC -isystem /home/wangjk/anaconda3/envs/torch-ngp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -std=c++14 -c /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu -o ffmlp.cuda.o
FAILED: ffmlp.cuda.o
/home/wangjk/anaconda3/envs/torch-ngp/bin/nvcc -DTORCH_EXTENSION_NAME=_ffmlp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/wangjk/programs/torch-ngp/ffmlp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/TH -isystem /home/wangjk/anaconda3/envs/torch-ngp/lib/python3.7/site-packages/torch/include/THC -isystem /home/wangjk/anaconda3/envs/torch-ngp/include -isystem /home/wangjk/anaconda3/envs/torch-ngp/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -std=c++14 -c /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu -o ffmlp.cuda.o
/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(243): error: explicit type is missing ("int" assumed)

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(243): error: expected a ")"

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(344): error: explicit type is missing ("int" assumed)

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(344): error: expected a ")"

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(577): error: name followed by "::" must be a class or namespace name

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(396): error: identifier "output_layout" is undefined
          detected during instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(396): error: name followed by "::" must be a class or namespace name
          detected during instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(60): error: name must be a namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(64): error: identifier "wmma" is undefined
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(64): error: too few arguments for alias template "std::conditional_t"
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(64): error: expected a ";"
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(67): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(67): error: type name is not allowed
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(67): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(67): error: identifier "act_frag" is undefined
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(68): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(68): error: type name is not allowed
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(68): error: type name is not allowed
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(68): error: identifier "weights_frag" is undefined
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(69): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(69): error: type name is not allowed
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(69): error: identifier "result_frag" is undefined
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(88): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(90): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(96): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(101): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(102): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(108): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(119): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void kernel_mlp_fused<WIDTH,BLOCK_DIM_Z,N_ITERS,OUT_T,INFERENCE>(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]"
(564): here
            instantiation of "void ffmlp_forward_cuda<WIDTH,INFERENCE>(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]"
(655): here

/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(119): error: name followed by "::" must be a class or namespace name
          detected during:
            instantiation of "void 
kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(165): error: name must be a namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(168): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(168): error: type name is not allowed detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, 
uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(168): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(168): error: identifier "act_frag" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(169): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(169): error: type name is not allowed detected during: instantiation of "void 
kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(169): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(169): error: identifier "weights_frag" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(170): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void 
ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(170): error: type name is not allowed detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(170): error: identifier "result_frag" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(213): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(217): error: name followed by "::" must be a class or 
namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(218): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(219): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(229): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, 
OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(229): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(252): error: name must be a namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(255): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here 
/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(255): error: type name is not allowed detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(255): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(255): error: identifier "act_frag" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(256): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, 
int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(256): error: type name is not allowed detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(256): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(256): error: identifier "weights_frag" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" 
(655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(257): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(257): error: type name is not allowed detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(257): error: identifier "result_frag" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(280): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, 
uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(284): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(289): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(290): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, 
Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(295): error: identifier "output_layout" is undefined detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(295): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(296): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(299): error: name followed by "::" must be a class or namespace name detected during: instantiation of "void 
kernel_mlp_fused(Activation, Activation, const __half *, const __half *, OUT_T *, OUT_T *, uint32_t, uint32_t, uint32_t, uint32_t, int) [with WIDTH=16, BLOCK_DIM_Z=1, N_ITERS=8, OUT_T=__half, INFERENCE=false]" (564): here instantiation of "void ffmlp_forward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, Activation, __half *, __half *) [with WIDTH=16U, INFERENCE=false]" (655): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(616): error: name followed by "::" must be a class or namespace name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(618): error: name followed by "::" must be a class or namespace name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(618): error: expected an identifier /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(618): error: "threads" has already been declared in the current scope /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(618): error: "shmem_size" has already been declared in the current scope /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(618): error: expected an identifier /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(619): error: parameter "activation" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(620): error: parameter "grad" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(621): error: variable "weights_second" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(622): error: parameter "backward_buffer" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(623): error: parameter "forward_buffer" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(624): error: parameter "grad_inputs" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(625): error: variable "weights_first" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(626): error: parameter "B" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(627): error: 
parameter "output_dim" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(628): error: parameter "num_layers" is not a type name /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(628): error: expected a ")" /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(597): warning: variable "weights_first" was declared but never referenced /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(598): warning: variable "weights_second" was declared but never referenced /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(606): warning: variable "threads" was declared but never referenced detected during instantiation of "void ffmlp_backward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, const __half *, __half *, __half *) [with WIDTH=16U]" (832): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(606): warning: variable "threads" was declared but never referenced detected during instantiation of "void ffmlp_backward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, const __half *, __half *, __half *) [with WIDTH=32U]" (833): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(606): warning: variable "threads" was declared but never referenced detected during instantiation of "void ffmlp_backward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, const __half *, __half *, __half *) [with WIDTH=64U]" (834): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(606): warning: variable "threads" was declared but never referenced detected during instantiation of "void ffmlp_backward_cuda(const __half *, const __half *, uint32_t, uint32_t, uint32_t, uint32_t, Activation, const __half *, __half *, __half *) [with WIDTH=128U]" (835): here /home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu(606): warning: variable "threads" was declared but never referenced detected during instantiation of "void ffmlp_backward_cuda(const __half *, const __half 
*, uint32_t, uint32_t, uint32_t, uint32_t, Activation, const __half *, __half *, __half *) [with WIDTH=256U]" (836): here 85 errors detected in the compilation of "/home/wangjk/programs/torch-ngp/ffmlp/src/ffmlp.cu". ninja: build stopped: subcommand failed. ```
ashawkey commented 2 years ago

@wangjksjtu ffmlp uses cutlass, which also requires architecture >= 70 (here). Maybe you could remove the import and usage of ffmlp in the Python script to avoid it, as it doesn't help very much right now.
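As a rough illustration of the architecture requirement above, the check below reads the GPU's compute capability and only selects ffmlp when it is at least 7.0 (i.e. "architecture >= 70"). This is a minimal sketch, not code from the repository: the helper names `meets_min_arch`, `current_capability`, and `pick_mlp_backend`, and the backend labels `"ffmlp"` / `"torch"`, are hypothetical. `torch.cuda.get_device_capability()` is the only PyTorch call assumed here.

```python
def meets_min_arch(capability, minimum=(7, 0)):
    """Return True if a (major, minor) compute capability is at least `minimum`."""
    return tuple(capability) >= tuple(minimum)


def current_capability():
    """Return the (major, minor) capability of the current GPU, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


def pick_mlp_backend():
    """Hypothetical selector: use 'ffmlp' only on GPUs with capability >= 7.0."""
    cap = current_capability()
    if cap is not None and meets_min_arch(cap):
        return "ffmlp"
    # Fall back to a plain PyTorch MLP so the ffmlp extension is never imported or built.
    return "torch"


if __name__ == "__main__":
    print("compute capability:", current_capability())
    print("selected backend:", pick_mlp_backend())
```

Deferring the ffmlp import until after such a check would keep the JIT build from being triggered on older GPUs, which is the same effect as removing the import by hand.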

wangjksjtu commented 2 years ago

Yeah, that is what I am doing now! However, I cannot obtain decent performance. Any thoughts? See issue https://github.com/ashawkey/torch-ngp/issues/5

ashawkey commented 2 years ago

@wangjksjtu thanks for spotting the bug, I have fixed it!

aoliao12138 commented 2 years ago

@ashawkey Thanks for your reply! I solved it.

ashawkey commented 2 years ago

Closed for now.