single-cell-data / SOMA

A flexible and extensible API for annotated 2D matrix data stored in multiple underlying formats.
MIT License

Allow overriding of CSR accumulators with an `IndexLike` #181

Closed. thetorpedodog closed this 6 months ago

thetorpedodog commented 8 months ago

To support a custom indexer in TileDB-SOMA, this introduces a basic abstraction for building the indexer that the CSR accumulator uses. An implementation's custom subclass can then override it to swap in a different indexer without duplicating substantial parts of the code.

The naming is, admittedly, not ideal; this is in part due to the naming of the things we're trying to abstract over (the Pandas `Index` type, which has the `get_indexer` method).
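
For readers who want a concrete picture of the pattern described above, here is a minimal sketch. It is not the code in this PR: `DictIndexer`, `BisectIndexer`, `Accumulator`, `CustomAccumulator`, and `_build_indexer` are hypothetical names chosen for illustration, and the `IndexLike` protocol shown is only a guess at the shape implied by the title. It uses plain lists so that it runs without dependencies. The only convention borrowed from Pandas is that `get_indexer` returns the position of each target value, or -1 when the value is absent.

```python
import bisect
from typing import List, Protocol, Sequence


class IndexLike(Protocol):
    """Hypothetical protocol: anything with a pandas-style get_indexer."""

    def get_indexer(self, target: Sequence[int]) -> List[int]:
        """Return the position of each target value, or -1 if absent."""
        ...


class DictIndexer:
    """Illustrative default indexer backed by a dict lookup."""

    def __init__(self, keys: Sequence[int]) -> None:
        self._positions = {k: i for i, k in enumerate(keys)}

    def get_indexer(self, target: Sequence[int]) -> List[int]:
        return [self._positions.get(t, -1) for t in target]


class BisectIndexer:
    """Illustrative replacement indexer using binary search."""

    def __init__(self, keys: Sequence[int]) -> None:
        # Sort the keys but remember each key's original position.
        order = sorted(range(len(keys)), key=keys.__getitem__)
        self._sorted_keys = [keys[i] for i in order]
        self._positions = order

    def get_indexer(self, target: Sequence[int]) -> List[int]:
        out = []
        for t in target:
            i = bisect.bisect_left(self._sorted_keys, t)
            found = i < len(self._sorted_keys) and self._sorted_keys[i] == t
            out.append(self._positions[i] if found else -1)
        return out


class Accumulator:
    """Hypothetical stand-in for a CSR accumulator (assumption, not somacore)."""

    def __init__(self, row_ids: Sequence[int]) -> None:
        self._row_indexer = self._build_indexer(row_ids)

    def _build_indexer(self, keys: Sequence[int]) -> IndexLike:
        # The single overridable hook: subclasses replace only this method.
        return DictIndexer(keys)

    def row_positions(self, row_ids: Sequence[int]) -> List[int]:
        return self._row_indexer.get_indexer(row_ids)


class CustomAccumulator(Accumulator):
    """A subclass swaps the indexer without copying any other logic."""

    def _build_indexer(self, keys: Sequence[int]) -> IndexLike:
        return BisectIndexer(keys)


ids = [10, 30, 20]
query = [30, 10, 99]
default_result = Accumulator(ids).row_positions(query)
custom_result = CustomAccumulator(ids).row_positions(query)
print(default_result, custom_result)
```

Both accumulators return the same positions for the same input. The point of the sketch is that the subclass changes how lookups are done by overriding one small method, which is the kind of override the description is aiming for.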


This is the somacore counterpart of https://github.com/single-cell-data/TileDB-SOMA/pull/1728. It is entirely compatible with current code, though; once we’re positive that it fulfills the needs of the tiledb-soma reindexer, we can merge and release it, and tiledbsoma installations will continue to work fine.

beroy commented 8 months ago

Please fix the formatting.

beroy commented 7 months ago

@thetorpedodog can you please merge this? My C++ reindexer depends on it.

thetorpedodog commented 7 months ago

I would like to wait until we get approval on the TileDB-SOMA side of the house. I am pretty confident that this is going to be it, but I don’t want to cut a release prematurely. It will be very quick—after merging this, we can immediately make the tag and everything will be ready in a matter of minutes. (There will be no rush to release TileDB-SOMA; this is fully compatible with existing TileDB-SOMA.)

thetorpedodog commented 6 months ago

We’re good to go!