Closed ookey closed 1 year ago
Same error here with a custom dataset; the fox example did work on my setup, though.
Update: for the custom dataset, it worked for me after rebuilding and regenerating the dataset (with COLMAP).
Same problem here (with the fox example), running on a GeForce 1060 (is that even possible?).
Any solution?
Any solution?
OK, I managed to solve it. On my end the problem happened because I built ngp inside a conda env, and since the conda env ships its own separate copy of CUDA (because of PyTorch), ngp tried to use that one. The solution is to remove all conda-env CUDA paths from $PATH before building ngp, so that ngp uses the OS's original CUDA install.
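As a concrete sketch of that workaround: the `conda` filter pattern below is an assumption about how the env directories are named, so verify `which nvcc` afterwards.

```shell
# Sketch: drop conda-owned entries from PATH so the build picks up the
# system CUDA toolkit instead of the copy shipped inside the conda env.
# The 'conda' filter pattern is an assumption; adjust it to your env paths.
clean_path=$(printf '%s' "$PATH" | tr ':' '\n' | grep -v conda | paste -sd: -)
export PATH="$clean_path"
which nvcc   # should now resolve to the system install, e.g. /usr/local/cuda/bin/nvcc
```

Run this in the same shell session you then invoke cmake from, since `export` only affects the current shell.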
I don't have conda on my system and I'm still having this problem...
Same issue. It might be related to aabb_scale in transforms.json: the error appears when aabb_scale is set to 8 or larger.
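If that is the trigger, one way to test it is to lower the value with a small in-place edit of the scene's transforms.json. A sketch, assuming the instant-ngp filename convention and that you back the file up first; the value 4 is just an example below the reported 8+ threshold:

```shell
# Sketch: print and lower aabb_scale in transforms.json.
python3 - <<'EOF'
import json

with open("transforms.json") as f:
    scene = json.load(f)
print("current aabb_scale:", scene.get("aabb_scale"))

scene["aabb_scale"] = 4  # example value below the reported 8+ threshold
with open("transforms.json", "w") as f:
    json.dump(scene, f, indent=2)
EOF
```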
Could you try again with the latest code from master / the latest binaries?
Hello, it's working! Since I slightly changed my config, here's the setup I successfully tested today:
$ git log --oneline -n 1
a0090e4 (HEAD, origin/master, origin/HEAD) NeRF: fix broken training on some scenes
$ nvidia-smi
Mon Jan 16 10:03:19 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...   On  | 00000000:15:00.0  On |                  N/A |
| 30%   30C    P8    31W / 350W |    877MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
$ TCNN_CUDA_ARCHITECTURES=86 cmake -B allbuilds/lin-rel-86-cu118/ .
$ cmake --build allbuilds/lin-rel-86-cu118/ -j
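A note on the TCNN_CUDA_ARCHITECTURES=86 part: the value has to match your GPU's compute capability (8.6 for RTX 30-series cards). Recent drivers can report it directly; a sketch, assuming a reasonably new nvidia-smi that supports the compute_cap query field:

```shell
# Sketch: derive the TCNN_CUDA_ARCHITECTURES value from the installed GPU.
# The compute_cap query needs a recent driver (the 520.61.05 above is fine).
cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)
arch=$(printf '%s' "$cap" | tr -d '.')   # e.g. "8.6" -> "86"
echo "TCNN_CUDA_ARCHITECTURES=$arch"
```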
$ cat configs/nerf/test.json
{
    "parent": "small.json",
    "network": {
        "n_output_dims": 17
    }
}
$ ./allbuilds/lin-rel-86-cu118/instant-ngp --scene /data/captures/sony/renaud-stephane/ --config configs/nerf/test.json
closing it
I tried nearly everything here to fix this bug when I ran it on my Nvidia GPU, and maybe I didn't implement the suggestions correctly. But in case someone finds this thread in the future: the fix that worked for me was to open the header file cutlass_matmul.h (it's in the dependencies' header folders) and comment out line 332, where it calls the error thrower. Now it runs consistently. I guess whatever function is erroring out isn't actually used in the product and can probably be ignored, but treat this as a last resort.
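For anyone applying the same last-resort patch: the line number drifts between commits, so it's safer to search for the throw than to hard-code line 332. A sketch; the submodule path is my assumption of where tiny-cuda-nn's copy of cutlass_matmul.h usually lives in an instant-ngp checkout:

```shell
# Sketch: locate the error-throwing line in cutlass_matmul.h before editing it.
# The path assumes tiny-cuda-nn is checked out as a submodule under dependencies/.
grep -n 'throw' dependencies/tiny-cuda-nn/include/tiny-cuda-nn/cutlass_matmul.h
```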
I also commented out that line (it's now line 330) and the problem was solved. Thanks, leon! Is there a more proper way to do this?
same issue here
@ookey what changes did you make in your config?
Hello, thanks for your valuable work. Setting the density_network's n_output_dims to 17 or above, I get the following error message:

My config:
89fe416

My config.json:

I understood that this 17 limit switches on the use of cutlass_matmul's fc_multiply, though I can't figure out why it's failing. Here is the call stack: