DESI-UR / VAST

Void Analysis Software Toolkit
https://vast.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
9 stars 8 forks source link

Enhancement: Split hash table into two segments to reduce locked access in mask_mode=='periodic' #69

Closed QuiteAFoxtrot closed 2 years ago

QuiteAFoxtrot commented 2 years ago

https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.99.2698&rep=rep1&type=pdf

In mask_mode=='periodic' the vast majority of cell lookups should come from the actual source of the survey - so we could leave this as its current shared memory implementation and treat it as static (never needs to resize) meaning that its read-only aka no synchronization/Locking is needed for inter process communication - then we could have a 2nd hash table just for the cells which are part of the exterior of the survey - since we require the galaxy_map_grid_edge_length to be aligned with the xyz_limits such that it produces a perfect integer number of cells, we can easily use those cell integers to determine if a query on an (i,j,k) cell belongs to the original/source survey or the exterior, and then perform the lookup appropriately in the original static table or in the synchronized/locked secondary table

QuiteAFoxtrot commented 2 years ago

Implemented this, way faster than the blocking version using a single hash table - could still be further possible improvements perhaps, I implemented the following logic:

in mask_mode==2 ("periodic") --> check if cell_ID is inside original/source data ----> if inside, allow unlocked access to original 'static' hash map and arrays ----> if outside, utilize locked access to a secondary resizable hash map and 'appended' rows at the end of the galaxy coordinate and lookup arrays

I believe the check for cell ID in the "source" data or the 'infinite exterior' should be very fast since it amounts to 6 integer comparisons: 0 <= i < i_max 0 <= j < j_max 0 <= k < k_max

Not implementing the N hash tables as described by the paper above at this time since in the vast majority of cases (with the possible exception of intentional test cases) I imagine we will only be growing spheres within the original/source survey