theislab / sfaira

data and model repository for single-cell data
https://sfaira.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Multimodal data #391

Open grst opened 3 years ago

grst commented 3 years ago

Does sfaira already deal with multimodal data in some way, such as

davidsebfischer commented 3 years ago

We are preparing for this, in parallel to cellxgene schema 2.0 release. By point:

Importantly, all of these can already be loaded in the load function, but they will not be considered during streamlining until these features are added. This means new supported modalities can easily be added to existing loaders.

grst commented 3 years ago

Sounds good, I will definitely check out the cellxgene schema.

w.r.t. the VDJ columns, I would recommend naming them as defined by the AIRR rearrangement standard. The columns that are most useful for VDJ analyses, and that I import into scirpy by default, are:

"productive",
"locus",
"v_call",
"d_call",
"j_call",
"c_call",
"junction",
"junction_aa",
"consensus_count",
"duplicate_count",

(see _io.py#L36-L47)
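
As a rough illustration of what restricting a table to these columns could look like, here is a minimal sketch using only the Python standard library. It assumes the input is a tab-separated AIRR rearrangement table; the helper name `select_airr_columns` and the example row are made up for illustration and are not part of scirpy or sfaira.

```python
import csv
import io

# AIRR rearrangement field names, as listed above.
AIRR_COLUMNS = [
    "productive", "locus", "v_call", "d_call", "j_call", "c_call",
    "junction", "junction_aa", "consensus_count", "duplicate_count",
]


def select_airr_columns(handle, columns=AIRR_COLUMNS):
    """Read a tab-separated table and keep only the requested columns.

    Hypothetical helper: columns missing from the input are set to None,
    and any other columns in the input are dropped.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [{col: row.get(col) for col in columns} for row in reader]


# Made-up example input with a subset of the fields plus one extra column.
example_tsv = (
    "sequence_id\tproductive\tlocus\tv_call\tj_call\tjunction_aa\n"
    "cell1_contig1\tT\tTRB\tTRBV7-2\tTRBJ2-1\tCASSLGQGNEQFF\n"
)
rows = select_airr_columns(io.StringIO(example_tsv))
print(rows[0])
```

Keeping the AIRR names unchanged means a table produced this way could be passed between tools that follow the standard without a renaming step.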