NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more
https://nvlabs.github.io/instant-ngp

Is hash necessary for proposed method? #97

Closed Jiang-Stan closed 2 years ago

Jiang-Stan commented 2 years ago

Hi, thanks for the great work! While reading the paper, I was impressed by the use of hashing as an efficient, storage-saving method. But I'm also curious whether it's possible to implement storage multiplexing with remainders instead of hash tables, since hashing is not a very hardware-friendly method, and a remainder can be regarded as a special case of a hash. So I tested the performance on some NeRF datasets after removing the fast_hash implementation from the code, as shown in this figure: [image]

I found that the performance is still impressive without the hash, as shown in the following tables:

On NeRF_synthetic (PSNR):

| data | steps=1000 w/ hash | steps=1000 w/o hash | steps=10000 w/ hash | steps=10000 w/o hash | steps=50000 w/ hash | steps=50000 w/o hash |
|---|---|---|---|---|---|---|
| lego | 32.208 | 32.602 | 35.478 | 35.757 | 36.511 | 36.621 |
| chair | 31.476 | 31.255 | 34.129 | 33.490 | 34.994 | 34.385 |
| drums | 25.170 | 25.253 | 26.891 | 26.893 | 27.612 | 27.487 |
| ficus | 29.611 | 29.923 | 31.075 | 31.184 | 31.522 | 31.580 |
| hotdog | 34.856 | 34.827 | 37.524 | 37.246 | 38.328 | 37.957 |
| materials | 28.191 | 28.199 | 31.074 | 31.024 | 32.100 | 31.980 |
| mic | 32.968 | 32.730 | 36.180 | 35.701 | 37.214 | 36.759 |
| ship | 27.822 | 27.861 | 30.813 | 30.508 | 31.743 | 31.545 |
On Tartanair (PSNR):

| data | w/ hash | w/o hash |
|---|---|---|
| abandonedfactory | 17.278 | 17.438 |
| amusement | 20.346 | 20.398 |
| carwelding | 23.781 | 23.096 |
| hospital | 28.752 | 28.121 |
| office2 | 26.315 | 26.531 |
| seasonforest | 19.134 | 18.172 |

I don't know if you have noticed this phenomenon before; I didn't see any hash-related ablation experiments in the paper. Or is there something in the code implementing the hash that I overlooked?

If there is nothing wrong with my code modification, is it possible to design a storage representation with a hardware-friendly mapping that achieves performance similar to the hash?
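
For concreteness, the two index mappings compared in this comment can be sketched roughly as follows. The XOR-with-primes form follows the spatial hash described in the paper. The "remainder" variant is only an assumption about what removing fast_hash amounts to (a linear grid index taken modulo the table size), and the function names are illustrative rather than the ones used in the repository.

```python
# Illustrative sketch only: function names and the exact "remainder" form are assumptions.
PRIMES = (1, 2654435761, 805459861)  # per-dimension primes of the paper's spatial hash


def hashed_index(pos, table_size):
    """XOR each integer coordinate times its prime, then wrap into the table."""
    h = 0
    for coord, prime in zip(pos, PRIMES):
        h ^= (coord * prime) & 0xFFFFFFFF  # emulate 32-bit unsigned wraparound
    return h % table_size


def remainder_index(pos, resolution, table_size):
    """Linear (dense) grid index, wrapped into the table with a plain modulo."""
    x, y, z = pos
    linear = x + y * resolution + z * resolution * resolution
    return linear % table_size


if __name__ == "__main__":
    table_size = 2 ** 19
    for p in [(0, 0, 0), (1, 0, 0), (100, 200, 300)]:
        print(p, hashed_index(p, table_size), remainder_index(p, 512, table_size))
```

Both functions return a slot in `[0, table_size)`; they differ only in which grid cells end up sharing a slot.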

Thanks in advance for your reply!

Tom94 commented 2 years ago

Hi there, thanks a lot for running these tests! You can actually use the TiledGrid mode of the encoding to obtain the "remainder" approach without any code changes.

We did experiment extensively with this, including hybrids, where the lower bits were tiled and the higher bits were hashed.
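
One way such a hybrid could look is sketched below. This is a guess for illustration, not the code that was actually used in those experiments: the low `tile_bits` of each coordinate address a cell inside a small dense tile, while the remaining high bits are hashed to pick which tile of the table to use. The name `hybrid_index`, the parameter `tile_bits`, and the assumption that the table size is a multiple of the tile size are all hypothetical.

```python
# Hypothetical hybrid mapping: low bits tiled, high bits hashed. Not the repository's code.
PRIMES = (1, 2654435761, 805459861)  # primes of the paper's spatial hash


def hybrid_index(pos, tile_bits, table_size):
    tile_res = 1 << tile_bits            # cells per axis inside one dense tile
    tile_cells = tile_res ** len(pos)    # cells in one tile
    num_tiles = table_size // tile_cells  # assumes table_size is a multiple of tile_cells

    offset, stride, h = 0, 1, 0
    for coord, prime in zip(pos, PRIMES):
        low = coord & (tile_res - 1)     # low bits: dense position within the tile
        high = coord >> tile_bits        # high bits: which tile in space
        offset += low * stride
        stride *= tile_res
        h ^= (high * prime) & 0xFFFFFFFF  # hash only the high bits
    return (h % num_tiles) * tile_cells + offset


if __name__ == "__main__":
    print(hybrid_index((5, 0, 0), tile_bits=2, table_size=2 ** 19))
```

Under this scheme, cells inside the same tile never collide with each other, and collisions happen only between whole tiles.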

The key observations were:

Tom94 commented 2 years ago

That said, you're 100% correct in that this discussion is missing from the paper. We really should add all this additional information for an updated version of it!

Jiang-Stan commented 2 years ago

Got it. Thanks a lot for your reply!

ashawkey commented 2 years ago

Hi, I wonder why TiledGrid mode limits each level's params by base_resolution instead of resolution here? This is different from simply commenting out the fast_hash part. Currently, TiledGrid mode uses far fewer parameters (total_encoding_params=131072) than HashGrid (total_encoding_params=12599920).
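
The gap between the two counts can be illustrated with a rough sketch of how the per-level caps add up. The settings below (16 levels, 2 features per level, base_resolution=16, log2_hashmap_size=19) are assumed, as are the simplified resolution formula and the placeholder per_level_scale, so the hash total is not expected to reproduce 12599920 exactly. The tiled total, however, would be consistent with 16 levels × 2 features × 16³ entries.

```python
import math


def params_per_level(resolution, mode, base_resolution, log2_hashmap_size):
    """Number of table entries one level gets under each grid mode (simplified)."""
    dense = resolution ** 3
    if mode == "dense":
        return dense
    if mode == "tiled":
        return min(dense, base_resolution ** 3)  # capped by the tile size
    if mode == "hash":
        return min(dense, 2 ** log2_hashmap_size)  # capped by the hash table size
    raise ValueError(f"unknown mode: {mode}")


def total_encoding_params(mode, n_levels=16, n_features=2, base_resolution=16,
                          per_level_scale=2.0, log2_hashmap_size=19):
    """Sum entries over all levels; every default here is an assumed setting."""
    total = 0
    for level in range(n_levels):
        # Simplified growth rule; the real code may round differently.
        resolution = math.ceil(base_resolution * per_level_scale ** level)
        entries = params_per_level(resolution, mode, base_resolution, log2_hashmap_size)
        total += entries * n_features
    return total


if __name__ == "__main__":
    for mode in ("tiled", "hash"):
        print(mode, total_encoding_params(mode))
```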

Tom94 commented 2 years ago

In Tiled mode, it's good to be able to directly control the resolution of the tiled region, which is taken care of by base_resolution. If each level's parameters were controlled by resolution, then that'd be DenseGrid mode.

Also -- double-checking the code -- the tiling computation seems to be skewed at the moment due to a missing modulo by base_resolution in the inner loop of grid_index; my bad. I remember removing such a modulo operation due to it negatively affecting the dense & hash modes. Seems like I was overeager in trying to avoid code duplication.
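
To illustrate the difference a per-axis modulo makes, here is a small sketch. It is not the actual grid_index code, and both function names are made up for the example. Wrapping the linear index once at the end does not make the grid repeat along each axis, whereas wrapping every coordinate by base_resolution inside the loop does.

```python
# Illustrative only: a guess at the idea behind the fix, not the repository's grid_index.

def linear_index_mod(pos, resolution, base_resolution):
    """Linear index wrapped once at the end: not periodic along each axis."""
    index, stride = 0, 1
    for coord in pos:
        index += coord * stride
        stride *= resolution
    return index % (base_resolution ** len(pos))


def tiled_index(pos, base_resolution):
    """Wrap each coordinate inside the loop so the grid repeats as a tile."""
    index, stride = 0, 1
    for coord in pos:
        index += (coord % base_resolution) * stride
        stride *= base_resolution
    return index


if __name__ == "__main__":
    a, b = (1, 2, 3), (17, 2, 3)  # b is a shifted by one tile width (16) along x
    print(tiled_index(a, 16), tiled_index(b, 16))
    print(linear_index_mod(a, 64, 16), linear_index_mod(b, 64, 16))
```

With the per-axis modulo, the two positions one tile apart share a slot; with the single trailing modulo they do not, which is one way the tiling could end up skewed.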

ashawkey commented 2 years ago

Thanks for the explanation!