Closed — 1uc closed this 10 months ago
When reading edge IDs for a large file we get a ~10'000x speedup. The benchmark is:
```python
import libsonata
import numpy as np

edge_filename = "/gpfs/bbp.cscs.ch/data/scratch/proj134/matwolf/v4_all_pathways/edges_try2.h5"
edge_file = libsonata.EdgeStorage(edge_filename)
population_name = "root__neurons__root__neurons__chemical"
population = edge_file.open_population(population_name)

n_nodes = 100
n_gids = 10_000_000
n_ranks = n_nodes * 40
gid_stride = 2  # simulates some selection effect

all_edge_ids = []
for k_rank in range(n_ranks):
    # fair_chunk(n_ranks, k_rank, n_gids) returns the (start, stop)
    # of rank k_rank's slice of [0, n_gids).
    gids = np.arange(*fair_chunk(n_ranks, k_rank, n_gids), gid_stride)
    edge_ids = population.afferent_edges(gids)
    all_edge_ids.append(edge_ids.ranges)
```
This computes the edge IDs used by each MPI rank for analysis purposes. With the optimization it takes about 2-4 s; without it, the first 10 ranks alone take 77 s (about 8.5 h in total).
The PR uses templates to hide the difference between `Selection::Ranges`, which is an `std::vector<std::pair>`, and `RawIndex`, which is an `std::vector<std::array>`. We use `std::get<N>` to access `x.first` and `x[0]` in a uniform manner.
The solution only works for `afferent_edges`, not for `efferent_edges`, due to data locality.
Reading edge IDs is optimized by aggregating ranges into larger (GPFS-friendly) ranges before creating the HDF5 selection, which reduces the number of individual reads. Any unneeded data is then filtered out in memory. This is very similar to the work done in #183.
This PR introduces the following:
- computing the `libsonata.Selection` for `?fferent_edge` in bulk;
- `SONATA_PAGESIZE`, which controls how large the merged regions need to be.