Open grst opened 1 year ago
As discussed with @zktuong, it would be nice to refer to the dandelion preprocessing workflow (which addresses some issues with the cellranger output) from this package and/or scirpy. In the end, this shouldn't be hard, as the dandelion pipeline reads cellranger output and writes AIRR, which can directly be consumed by the read_airr function.
tagging @DennisCambridge
In the scverse core team, a consensus was reached that IO should not be part of the analysis packages (e.g. scanpy, scirpy, muon), but should instead live in an independent package with minimal dependencies that the analysis packages depend on. The hope is that this leads to wider adoption of scverse data structures, since the "dependency cost" of depending on a lightweight IO package is lower than that of depending on an entire framework. This issue tracks the goal of creating such a package for scirpy.
Name (?)
A couple of ideas
Scope
- read_xxx and write_xxx functions in scirpy.io
- AirrCell, to_airr_cells and from_airr_cells functions
- to/from_dandelion (ideally dandelion adapts the scverse datastructure. Otherwise these functions should live in dandelion itself)
Maybe
- merge_airr
- index_chains
- get.airr
The latter two go beyond just storing AIRR data as an awkward array and implement the scirpy receptor model. They are likely useful for some other packages, but then again, a package that needs them could just depend on the full scirpy.
In case of doubt, err on the side of including less in the package, as it could be added later if required.
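To give a feel for how small the core of such an IO package could be, here is a rough sketch that uses only the Python standard library. It reads AIRR rearrangement rows from tab-separated text and groups them by cell. The function name group_chains_by_cell and the example rows are made up for illustration and are not part of scirpy or dandelion. The column names (sequence_id, cell_id, locus, junction_aa, productive) are assumed to follow the AIRR rearrangement schema.

```python
import csv
import io
from collections import defaultdict


def group_chains_by_cell(airr_tsv):
    """Group the rows of an AIRR rearrangement TSV by their cell_id column.

    Hypothetical helper for illustration only; returns a dict that maps
    each cell_id to the list of its rows (one dict per receptor chain).
    """
    cells = defaultdict(list)
    reader = csv.DictReader(io.StringIO(airr_tsv), delimiter="\t")
    for row in reader:
        cells[row["cell_id"]].append(row)
    return dict(cells)


# Made-up example data, not taken from any real dataset.
EXAMPLE_TSV = (
    "sequence_id\tcell_id\tlocus\tjunction_aa\tproductive\n"
    "contig_1\tcell_A\tTRA\tCAVSDSGGYQKVTF\tT\n"
    "contig_2\tcell_A\tTRB\tCASSLGQGNTEAFF\tT\n"
    "contig_3\tcell_B\tTRB\tCASSPGTGGYEQYF\tT\n"
)

cells = group_chains_by_cell(EXAMPLE_TSV)
for cell_id, chains in cells.items():
    # Print the loci found for each cell.
    print(cell_id, [chain["locus"] for chain in chains])
```

Anything beyond this grouping step, such as deciding which chains count as primary or secondary, would belong to the receptor model rather than to IO, which is the boundary the "Maybe" items above are about.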