icoderaven closed this issue 4 years ago.
Hi @icoderaven,
From reading the gvdbCalc(Incre)ExtraBrickId code in cuda_gvdb_particles.cuh, it seems that the radius parameter is used to allocate extra bricks that contain no points themselves, but lie within radius distance of at least one point.
For small radius values (< brick width), the maximum number of bricks marked will probably be around 8 × num_pnts, since the radius around each point can intersect up to 8 bricks if the point sits at a brick corner. We also have to account for the nodes at the higher levels, so a safe estimate might be around 10 × num_pnts elements in the AUX_BRICK_LEVXYZ buffer, which makes the allocation `PrepareAux ( AUX_BRICK_LEVXYZ, 10 * pNumPnts * pRootLev * 4, sizeof(unsigned short), true );`.
In reality, the actual size of the AUX_BRICK_LEVXYZ buffer should be smaller than this theoretical maximum, though; it really depends on the distribution of your points and the radius.
BTW, if anyone is interested in learning more about the dynamic topology rebuild process, it is explained in detail in the paper 'Fast Fluid Simulations with Sparse Volumes on the GPU' (DOI: 10.1111/cgf.13350), of which @ramakarl is one of the authors. It is also available here (PDF).
Thanks for the PDF link. Looks useful - surprised I didn't find it earlier. I think some of my original questions still stand, however.
Addressed in #79
Hi
I noticed that my application would sometimes fail to rebuild the topology, depending on the extent of the reprojected points from my 640x480 depth images. On further investigation, I noticed that in ActivateExtraBricksGPU() https://github.com/NVIDIA/gvdb-voxels/blob/405a58089521920680cf11ff09f970617d682863/source/gvdb_library/src/gvdb_volume_gvdb.cpp#L1055 the auxiliary data buffer AUX_BRICK_LEVXYZ for the brickIds doesn't account for the fact that the kernel gvdbCalcExtraBrickId() can try to add many more brickIds than numPts*numLevels, depending on the passed radius parameter. This causes my application to crash with memory access errors. https://github.com/NVIDIA/gvdb-voxels/blob/405a58089521920680cf11ff09f970617d682863/source/gvdb_library/kernels/cuda_gvdb_particles.cuh#L553
From what I can make out, the process is to first figure out all the brickIds that don't have an allocated parent, store them in this buffer, then find the unique brickIds among them, and finally allocate those bricks. This choice seems to stem from the fact that it would be slow to search for uniqueness and insert the ids within a single kernel.
Two questions:
a. How exactly does the radius parameter affect the choice of brickIds? It's slightly unclear what the recommended value should be (the sample code uses a value of 2.0, so does that mean a neighbourhood of 2 spanning bricks at each level of the hierarchy?)
b. Is the correct solution to change the allocated buffer size to pNumPnts * pRootLev * pow(2*radius/range_level, 3) * 4? That seems like a lot.... Adding a check against atomicAdd-ing more elements than the buffer permits seems suboptimal. At that point, it would seem like some sort of shared-memory based updating of unique brickIds makes more sense?