Hi there,
you might be able to further squeeze down the memory usage by reducing the resolution with --width 1280 --height 720, but I'm not sure whether that will be enough.
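For example, the launch command would look something like this (paths assuming a default Windows build, matching the command quoted later in this thread):

```
./build/testbed.exe --scene data/nerf/fox --width 1280 --height 720
```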
Regarding atomicAdd(__half2): I'm surprised, actually. How does this error manifest? I'd like to make this codebase work on as wide a range of GPUs as possible, and both the CUDA documentation and CI suggest it should work on compute capability 61.
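If you get a chance, could you also check whether a minimal standalone kernel along these lines builds and runs when compiled for sm_61 on your setup? This is just a hypothetical repro sketch based on the documented __half2 overload of atomicAdd; none of the names come from the repo:

```cuda
// repro_half2_atomic.cu -- hypothetical standalone repro, not taken from instant-ngp.
// Build: nvcc -arch=sm_61 repro_half2_atomic.cu -o repro_half2_atomic
#include <cuda_fp16.h>
#include <cstdio>

__global__ void accumulate(__half2* acc) {
    // Each of the 32 threads atomically adds (1, 2) to the accumulator.
    // The __half2 overload of atomicAdd is documented for compute capability 6.x+.
    atomicAdd(acc, __floats2half2_rn(1.0f, 2.0f));
}

__global__ void read_back(const __half2* acc, float2* out) {
    // Convert back to float on the device to avoid host-side half conversions.
    out->x = __low2float(*acc);
    out->y = __high2float(*acc);
}

int main() {
    __half2* acc = nullptr;
    float2* out = nullptr;
    cudaMalloc(&acc, sizeof(__half2));
    cudaMalloc(&out, sizeof(float2));
    cudaMemset(acc, 0, sizeof(__half2));

    accumulate<<<1, 32>>>(acc);
    read_back<<<1, 1>>>(acc, out);

    float2 host;
    cudaMemcpy(&host, out, sizeof(float2), cudaMemcpyDeviceToHost);
    printf("accumulated: %g %g (expected 32 and 64)\n", host.x, host.y);

    cudaFree(acc);
    cudaFree(out);
    return 0;
}
```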
When building I get:
D:\***\include\neural-graphics-primitives/common_device.cuh(127): error : no instance of overloaded function "atomicAdd" matches the argument list [D:\***\build\ngp.vcxproj]
argument types are: (__half2 *, {...})
detected during instantiation of "void ngp::deposit_image_gradient(const Eigen::Matrix<float, N_DIMS, 1, <expression>, N_DIMS, 1> &, T *, T *, const Eigen::Vector2i &, const Eigen::Vector2f &) [with N_DIMS=2U, T=float]"
D:\***\src\testbed_nerf.cu(1512): here
D:\***\include\neural-graphics-primitives/common_device.cuh(128): error : no instance of overloaded function "atomicAdd" matches the argument list [D:\***\build\ngp.vcxproj]
argument types are: (__half2 *, {...})
detected during instantiation of "void ngp::deposit_image_gradient(const Eigen::Matrix<float, N_DIMS, 1, <expression>, N_DIMS, 1> &, T *, T *, const Eigen::Vector2i &, const Eigen::Vector2f &) [with N_DIMS=2U, T=float]"
D:\***\src\testbed_nerf.cu(1512): here
Maybe it's a dependency problem rather than the hardware not supporting it. My environment: Windows 10, CUDA compilation tools release 11.6, V11.6.55 (build cuda_11.6.r11.6/compiler.30794723_0), CMake 3.23.0-rc1, Python 3.8.10.
Running the following command still gives me the same error:
./build/testbed.exe --scene data/nerf/fox --width 10 --height 10
But reducing the number of photos to 20 makes it possible to run it.
I just got the same error, running on a GTX 1080. The fox scene does work for me, but when I try to run a dataset I prepared myself it gives this error. That was a very large dataset, though, so I tried shrinking it down and it still gives the same error (it's smaller than the fox dataset in MB at this point).
edit: I forgot to add that adding --width 10 --height 10 also does nothing for me.
I'm also getting the same error regarding atomicAdd (running on a GTX 1080 Ti).
What are the VRAM requirements for the provided examples after all?
The VRAM requirements vary with architecture: older GPUs unfortunately need more memory because they have to fall back to fp32 for efficiency and cannot run the fully fused neural networks.
In general, it seems that 8 GB is enough to run fox
in all cases -- so only a little push would be needed to make it fit into OP's 6 GB card.
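If it helps to gauge how much headroom a 6 GB card actually has before launching the testbed, here is a small standalone sketch (plain CUDA runtime API, not part of the testbed) that prints free vs. total VRAM:

```cuda
// vram_check.cu -- hypothetical helper, not part of instant-ngp.
// Build: nvcc vram_check.cu -o vram_check
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    // Query the current device's free and total memory.
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("free: %.2f GB / total: %.2f GB\n", free_bytes / 1e9, total_bytes / 1e9);
    return 0;
}
```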
Hi,
Thanks for sharing this great work. I'm trying to run the samples on a smaller GPU: a GTX 1060 6 GB. The "Einstein" example runs fine, but when I run the fox example I get an out-of-memory error.
Is it still possible to run this example with some modified parameters for GPUs with less memory, or should I give up?
Small note: atomicAdd(__half2) is also not supported on my architecture (compute capability 61). I needed to disable it in "common_device.cuh".
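In case it's useful to others on the same architecture: rather than deleting the call outright, one possible workaround is to emulate the __half2 atomic add with a compare-and-swap loop. This is only a sketch of that idea (the function name is mine, and it is not the fix the repo uses); it assumes __hadd2 is available, i.e. compute capability 5.3 or newer:

```cuda
#include <cuda_fp16.h>
#include <cstring>

// atomic_add_half2_cas is a hypothetical name; it emulates
// atomicAdd(__half2*, __half2) with a 32-bit compare-and-swap loop.
__device__ inline __half2 atomic_add_half2_cas(__half2* address, __half2 val) {
    unsigned int* address_as_ui = reinterpret_cast<unsigned int*>(address);
    unsigned int old = *address_as_ui;
    unsigned int assumed;
    do {
        assumed = old;
        // Reinterpret the 32 bits currently stored as a __half2, add val,
        // and try to publish the result only if nobody changed it meanwhile.
        __half2 current, next;
        memcpy(&current, &assumed, sizeof(current));
        next = __hadd2(current, val);  // packed half add, available on sm_53+
        unsigned int next_ui;
        memcpy(&next_ui, &next, sizeof(next_ui));
        old = atomicCAS(address_as_ui, assumed, next_ui);
    } while (assumed != old);          // another thread won the race; retry
    __half2 result;
    memcpy(&result, &old, sizeof(result));
    return result;                     // previous value, matching atomicAdd's convention
}
```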