Closed — 1uc closed this 10 months ago
When reading edge IDs for a large file we get a ~10'000x speedup. The benchmark is:
```python
import libsonata
import numpy as np

edge_filename = "/gpfs/bbp.cscs.ch/data/scratch/proj134/matwolf/v4_all_pathways/edges_try2.h5"
edge_file = libsonata.EdgeStorage(edge_filename)
population_name = "root__neurons__root__neurons__chemical"
population = edge_file.open_population(population_name)

n_nodes = 100
n_gids = 10_000_000
n_ranks = n_nodes * 40
gid_stride = 2  # simulates some selection effect

all_edge_ids = []
for k_rank in range(n_ranks):
    # fair_chunk(n_ranks, k_rank, n_gids) returns the (start, stop)
    # of rank k_rank's slice of [0, n_gids).
    gids = np.arange(*fair_chunk(n_ranks, k_rank, n_gids), gid_stride)
    edge_ids = population.afferent_edges(gids)
    all_edge_ids.append(edge_ids.ranges)
```
This computes the edge IDs used by each MPI rank for analysis purposes. With the optimization it takes about 2-4 s; without it, the first 10 ranks alone take 77 s (about 8.5 h in total).
The PR uses templates to hide the difference between `Selection::Ranges`, which is an `std::vector<std::pair>`, and `RawIndex`, which is an `std::vector<std::array>`. We use `std::get<N>` to access `x.first` and `x[0]` in a uniform manner.
The solution only works for `afferent_edges`, not for `efferent_edges`, due to data locality.
Reading edge IDs is optimized by aggregating ranges into larger (GPFS-friendly) ranges before creating the HDF5 selection, which reduces the number of individual reads. Any unneeded data is then filtered out in memory. This is very similar to the work done in #183.
This PR introduces the following:
- computing the `libsonata.Selection` for `?fferent_edge` in bulk;
- `SONATA_PAGESIZE`, which controls how large the merged regions need to be.