NVIDIA / gvdb-voxels

Sparse volume compute and rendering on NVIDIA GPUs
Other
679 stars 145 forks source link

ComputeKernel and Compute + GVDB_VOX performs redundant (+ inaccurate?) computations #86

Closed icoderaven closed 4 years ago

icoderaven commented 4 years ago

Hi!

So I was investigating the UpdateApron method and while dealing with it I realised that the GVDB_VOX macro used in kernels that run over atlas space bricks, and the call to said kernels doesn't make a lot of sense. GVDB_VOX defines the voxel id as uint3 vox = blockIdx * make_uint3(blockDim.x, blockDim.y, blockDim.z) + threadIdx + make_uint3(1,1,1);

Clearly, the inherent assumption being made here (and further verified by looking at the macros defined just above such as GVDB_COPY_SMEM_UC) is that the apron is 1 unit wide, and the blocks are 8x8x8 threads. However, the ComputeKernel calls are over a grid of all atlas bricks including apron width (getAtlasRes will return N(8+2)) - so assuming that the leaf nodes are 8x8x8 voxels then the grid runs over 10N/8 + 1 in each dimension, where N is the actual number of atlas bricks.

Instead I propose modifying the calls to the Compute kernels to run the grids over the N bricks themselves, and modifying the GVDB_VOX macro such that it only operates on the 8x8x8 actual data voxels of the brick and leave the apron voxels alone. They should be updated by the UpdateApron method calls anyway.

Of course the macro doesn't work for a different GVDB level -1 configuration, or for apron size other than 1, but it doesn't seem like the other macros will either.

Example in this commit https://github.com/icoderaven/gvdb-voxels/commit/2adf6c47f4ca9345b13ca6682970d74fe3cedc6c

NBickford-NV commented 4 years ago

Hi icoderaven,

I've put a bit of work up on modifying these kernels and callers to skip over apron voxels on the new packed-compute branch, if you want to take an early look at it and see if it's compatible with your workflow! I suspect there are probably a few bugs in it at the moment, though.

This adds a new parameter to Compute and ComputeKernel, which is an API-breaking change. The reason for this is so that developers writing kernels passed to ComputeKernel know that they may need to switch to using the smaller grid size, or to specify that they're using the larger grid size (iterating over the entire atlas). It should be compatible with any tree whose bricks have dimensions divisible by 8, and is intended to work with different apron sizes, should apron sizes other than 1 be enabled in the future.

Hope this helps!

NBickford-NV commented 4 years ago

Hi icoderaven,

I just fixed the bugs I found in the packed-compute branch and verified that the samples work - if it looks good with you, I'll go ahead and merge it into master. Note that for the filling and resampling kernels, we preserve their previous behavior (of also modifying atlas voxels), so that people will have to change less code downstream (I hope).

Thanks!

icoderaven commented 4 years ago

This looks great! Apologies, was working on a deadline until today. I'll have a proper look at this over the weekend, but a quick scan seems to look good! Thanks!

NBickford-NV commented 4 years ago

Merged in https://github.com/NVIDIA/gvdb-voxels/commit/4ea8fc3b791263d306f3004bc9b533eb37780874!