scverse / scirpy

A scanpy extension to analyse single-cell TCR and BCR data.
https://scirpy.scverse.org/en/latest/
BSD 3-Clause "New" or "Revised" License

ir.tl.chain_qc doesn't mark cells with no IR #452

Open a-munoz-rojas opened 10 months ago

a-munoz-rojas commented 10 months ago

Describe the bug

ir.tl.chain_qc should mark cells that don't have any detected immune receptor (as stated in the docs). In the new data structures, this information should be stored in columns such as "airr:receptor_type", annotated as "no IR". However, when you run this function, cells lacking an IR are annotated with NaN values instead, so you can't see how many cells have no associated IR when plotting.

To Reproduce

import muon as mu
import scirpy as ir

mdata = ir.datasets.wu2020_3k()
adata = mdata['gex'].copy()
adata_tcr = mdata['airr'].copy()
adata_tcr = adata_tcr[0:-100, :].copy()  # artificially remove TCR info from the last 100 cells

mdata = mu.MuData({"gex": adata, "airr": adata_tcr})
ir.pp.index_chains(mdata)
ir.tl.chain_qc(mdata)

# inspect the last cells: their values are stored as NaN
mdata.obs["airr:receptor_subtype"].tail()

# plot the subtypes
_ = ir.pl.group_abundance(
    mdata, groupby="airr:receptor_subtype", target_col="gex:source"
)

Expected behaviour

Cells with no IR should be annotated as "no IR", according to the docs (https://scirpy.scverse.org/en/latest/generated/scirpy.tl.chain_qc.html).

System

Additional context

grst commented 10 months ago

Hi,

thanks for reporting this! I believe the reason is that chain_qc operates on the mdata["airr"] slot, which only contains cells with a receptor. Writing the information back to mdata.obs then fills in NaN for cells that are not in mdata["airr"].

It should be possible to fix this pretty easily.
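
The NaN coercion described above can be reproduced with plain pandas, independent of scirpy. The following is a minimal sketch, not scirpy's actual write-back code: the cell names, the subtype values, and the use of `reindex` as a stand-in for the alignment step are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the full set of cells (e.g. everything in "gex").
all_cells = pd.Index(["cell1", "cell2", "cell3", "cell4", "cell5"])

# Hypothetical stand-in for the per-modality annotation: only the first
# three cells are present in the "airr" modality.
airr_obs = pd.DataFrame(
    {"receptor_subtype": pd.Categorical(["TRA+TRB", "TRA+TRB", "IGH+IGK"])},
    index=all_cells[:3],
)

# Aligning the column to the full cell index leaves the absent cells as NaN,
# because there is no value to copy for them.
aligned = airr_obs["receptor_subtype"].reindex(all_cells)
n_missing = int(aligned.isna().sum())

print(aligned)
print("cells without a value:", n_missing)
```

Any cell that is absent from the narrower index ends up without a value after alignment, which matches the NaN entries seen in the reproduction.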

grst commented 8 months ago

I fixed the chain_qc function to also compute values for cells not in the AIRR modality in https://github.com/scverse/scirpy/pull/463. But plotting the result requires changes to the group_abundance function, which I'll tackle together with a complete overhaul of how barplots are generated (this has been planned for a while, see https://github.com/scverse/scirpy/issues/232).
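
For versions without that fix, one possible user-side workaround is to relabel the NaN entries by hand before counting cells. This is a sketch under assumptions, not the change made in the pull request: `label_missing_ir` is a hypothetical helper, the example data is made up, and it only uses standard pandas categorical methods.

```python
import pandas as pd


def label_missing_ir(col: pd.Series, label: str = "no IR") -> pd.Series:
    """Return a categorical copy of `col` with NaN replaced by `label`.

    Hypothetical helper, not part of scirpy.
    """
    col = col.astype("category")
    # A categorical column only accepts values that are known categories,
    # so the label has to be registered before filling.
    if label not in col.cat.categories:
        col = col.cat.add_categories([label])
    return col.fillna(label)


# Made-up example column: two cells with a receptor, two without.
subtype = pd.Series(
    pd.Categorical(["TRA+TRB", None, "TRA+TRB", None]),
    index=["cell1", "cell2", "cell3", "cell4"],
)

labelled = label_missing_ir(subtype)
counts = labelled.value_counts()
print(counts)
```

In the reproduction above, the helper could be applied to mdata.obs["airr:receptor_subtype"] to count the cells without an IR. Whether group_abundance then displays the extra category is not confirmed here, since the plotting function is described as needing its own changes.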