API for optimized access order

1uc commented 1 year ago

This PR proposes an API that will enable us to optimize morphology loading for certain reorderable loops. It targets loops that know upfront the names of the morphologies they need to load and don't require them to be loaded in a particular order.

The types of optimization this unlocks are:

Reordering the loop to reduce large strides in the file.
Loading morphologies in (small) batches.
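
To illustrate the batching idea, here is a minimal sketch that cuts an already sorted access order into small consecutive batches. The helper name `make_batches` and the choice to represent the access order as a vector of indices are assumptions made for this example; they are not taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MorphIO): split an access order, i.e.
// indices already sorted by position in the file, into consecutive batches
// of at most `batch_size` indices each.
std::vector<std::vector<std::size_t>> make_batches(
    const std::vector<std::size_t>& access_order, std::size_t batch_size) {
    assert(batch_size > 0);

    std::vector<std::vector<std::size_t>> batches;
    for (std::size_t begin = 0; begin < access_order.size(); begin += batch_size) {
        const std::size_t end = std::min(begin + batch_size, access_order.size());
        batches.emplace_back(
            access_order.begin() + static_cast<std::ptrdiff_t>(begin),
            access_order.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return batches;
}
```

Because the input is already sorted by file position, each batch covers a contiguous stretch of the access order, so a loader could in principle fetch it with fewer, larger reads.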

1uc commented 1 year ago

Back to draft, because the internals can be simplified.

1uc commented 1 year ago

This PR now also includes an optimized version for merged containers.

1uc commented 1 year ago

Should we allow random access? This would matter in loops which look like this:

std::vector<std::string> morphology_names = ...;
auto load_unordered = collection.load_unordered(morphology_names);

// Add some mechanism which doesn't guarantee the order in which
// chunks are accessed. For example
#pragma omp parallel for
for(size_t chunk = 0; chunk < n_chunks; ++chunk) {
    size_t offset = chunk*chunk_size;
    process_chunk(chunk_size, load_unordered.begin()+offset);
}

void process_chunk(size_t chunk_size, LoadUnordered<Morphology>::Iterator it) {
    for(size_t i = 0; i < chunk_size; ++i) {
        auto [k, morph] = *(it++);
        // computations  
    }
}

Technically, though, the above doesn't traverse the morphologies in the optimal order, since the chunks are visited in no particular order. Making the iterator shared would address that, but it raises additional thread-safety concerns.
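
One possible way to share the traversal between threads is an atomic cursor that hands out consecutive index ranges. The sketch below is only an illustration under that assumption: `ChunkCursor` is a hypothetical class, not something from the PR, and it only coordinates which range each thread claims. Whether loading itself can safely happen concurrently is a separate question that this sketch does not answer.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical shared cursor: each call to `next` claims the next
// consecutive range [begin, end) of positions in the access order.
class ChunkCursor {
  public:
    ChunkCursor(std::size_t size, std::size_t chunk_size)
        : size_(size), chunk_size_(chunk_size) {
        assert(chunk_size > 0);
    }

    // Returns false once all positions have been handed out.
    bool next(std::size_t& begin, std::size_t& end) {
        // fetch_add is atomic, so no two threads receive the same range.
        const std::size_t claimed =
            next_.fetch_add(chunk_size_, std::memory_order_relaxed);
        if (claimed >= size_) {
            return false;
        }
        begin = claimed;
        end = std::min(claimed + chunk_size_, size_);
        return true;
    }

  private:
    std::size_t size_;
    std::size_t chunk_size_;
    std::atomic<std::size_t> next_{0};
};
```

Each worker would loop on `next` until it returns false. Ranges are then started in access order regardless of how many threads participate, although they may still complete out of order.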

What would be more useful in TD would be access to the internal argsort:

auto morphology_names = ...;
auto collection = ...;

auto access_order = morphio::argsort(collection, morphology_names);
for(size_t i : access_order) {
    auto morph_name = morphology_names[i];
    process(morph_name);
}
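
For readers unfamiliar with the term, an argsort returns the permutation of indices that would sort the input, rather than the sorted input itself. The following is a minimal sketch of the idea, assuming the container can map each morphology name to a byte offset in the file. `OffsetTable` and `argsort_by_offset` are hypothetical names for this illustration; the actual internals may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the container's index:
// morphology name -> byte offset in the file.
using OffsetTable = std::unordered_map<std::string, std::uint64_t>;

// Returns indices into `names`, ordered by increasing file offset.
std::vector<std::size_t> argsort_by_offset(const OffsetTable& offsets,
                                           const std::vector<std::string>& names) {
    // Look up every offset once, so the comparator doesn't hash strings.
    std::vector<std::uint64_t> position(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        position[i] = offsets.at(names[i]);  // throws std::out_of_range if unknown
    }

    std::vector<std::size_t> order(names.size());
    std::iota(order.begin(), order.end(), std::size_t{0});

    auto by_position = [&position](std::size_t a, std::size_t b) {
        return position[a] < position[b];
    };
    // stable_sort keeps the caller's relative order for equal offsets.
    std::stable_sort(order.begin(), order.end(), by_position);

    // Debug-only sanity check of the postcondition.
    assert(std::is_sorted(order.begin(), order.end(), by_position));
    return order;
}
```

The returned vector has the same length as `names`, and `names` itself is left untouched, which is what makes the loop above possible.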

matz-e commented 1 year ago

Seems reasonable? Although, for "user-friendliness", maybe just sorting the morphology names, without a round trip through indices on the user side, would be nicer?

1uc commented 1 year ago

Sorting the morphology names directly loses valuable information. For example, it prevents one from looking up auxiliary information when things are stored in vectors of equal size, with the convention that index i retrieves the data for that morphology: morphology_names[i], metadata[i], etc.
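
A small sketch of that convention, assuming two parallel vectors and an access order given as indices. The `MetaData` fields, the `Task` struct and `gather_in_order` are made up for this example and do not reflect the actual TD code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-morphology record stored alongside the names.
struct MetaData {
    double scale;
};

// A name paired with the auxiliary data found at the same index.
struct Task {
    std::string name;
    MetaData meta;
};

// Walks the parallel vectors in the given access order. Because the order is
// expressed as indices, names[i] and metadata[i] stay associated.
std::vector<Task> gather_in_order(const std::vector<std::string>& names,
                                  const std::vector<MetaData>& metadata,
                                  const std::vector<std::size_t>& access_order) {
    assert(names.size() == metadata.size());

    std::vector<Task> tasks;
    tasks.reserve(access_order.size());
    for (std::size_t i : access_order) {
        assert(i < names.size());
        tasks.push_back(Task{names[i], metadata[i]});
    }
    return tasks;
}
```

Had only a sorted copy of `names` been returned, recovering the matching `metadata` entry would require an extra name-to-index lookup for every morphology.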

More concretely, in TD we store things in a big std::vector<MetaData>. We then fish entries out of it by passing the indices of the morphologies we want to load; that code grabs the morphology name and uses (a wrapper of) morphio::Collection::load to load the morphology. I don't see us injecting the iterator into that setup easily, because in TD the steps of selecting (which requires the optimized order) and loading the morphology are separated by several function calls, whereas with the iterator approach they always happen together. Note that knowing only the order of the names of the morphologies to be loaded is insufficient (or inefficient) in TD.

BlueBrain / MorphIO

API for optimized access order #453