Closed llumdi closed 2 years ago
Was fixed with bc.tl.sig.obtain_dblabel
Sorry, I did not explain it correctly. I was not referring to the conversion that already exists in bc.tl.sig.obtain_dblabel, which is used in the celltype annotation notebook and requires cnames and a dataframe. I was referring to the ability to convert the dblabel column in adata.obs (or any column containing valid dblabels) to either the short or the long name version (as in bescaviz).
E.g., if I load a study, I have these names:
set(adata.obs['dblabel'])
{'CD1c-positive myeloid dendritic cell',
'CD4-positive, alpha-beta cytotoxic T cell',
'CD8-positive, alpha-beta cytotoxic T cell',
'basophil',
'central memory CD4-positive, alpha-beta T cell',
'classical monocyte',
'cytotoxic CD56-dim natural killer cell',
'doublet',
'effector memory CD4-positive, alpha-beta T cell',
'effector memory CD8-positive, alpha-beta T cell',
'gamma delta T cell',
'hematopoietic stem cell',
'monocyte',
'mucosal invariant T cell',
'naive B cell',
'naive thymus-derived CD4-positive, alpha-beta T cell',
'naive thymus-derived CD8-positive, alpha-beta T cell',
'neutrophil',
'non-classical monocyte',
'plasma cell',
'plasmacytoid dendritic cell',
'platelet',
'regulatory T cell'}
But I would like to plot the short version names. How can I do that? Thanks, Ll
OK, I'll have a look. Thank you for pointing it out.
Hi @llumdi, could you check the last commit on the signature_revision_branch and tell me if this is what you had in mind?
As a ECM, once the commit (https://github.com/bedapub/besca/commit/682288c686eddfe0964f3b333e2dc425a3a9340b) is checked out:
import scanpy as sc
import besca as bc
adata = bc.datasets.pbmc3k_processed()
sc.pl.umap(adata, color='celltype3')
# Match each celltype3 label against the nomenclature table
matching_v = bc.tl.sig.match_label(adata.obs['celltype3'], '../besca/datasets/nomenclature/CellTypes_v1.tsv')
# Map the original labels to their short names and plot them
adata.obs['short'] = adata.obs['celltype3'].map(dict(matching_v.values))
sc.pl.umap(adata, color='short')
# Check the conversion table
matching_v
Thanks for implementing the new function, Alice. I tested it and it works perfectly (I will send you the report with different scenarios). Just a small suggestion: the error should report only the categories that were not found instead of the list of all values, which can be very long.
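The suggestion to report only the categories that were not found could be sketched roughly as below. This is a minimal illustration in plain Python, not the besca implementation; the helper names `unmatched_labels` and `check_labels` are hypothetical and do not exist in besca.

```python
def unmatched_labels(labels, known_labels):
    """Return the sorted, de-duplicated labels that are absent from known_labels."""
    known = set(known_labels)
    return sorted({label for label in labels if label not in known})


def check_labels(labels, known_labels):
    """Raise an error that lists only the labels that could not be matched."""
    missing = unmatched_labels(labels, known_labels)
    if missing:
        raise ValueError(
            f"{len(missing)} label(s) not found in the nomenclature: {missing}"
        )


# Hypothetical example: 'doublet' is assumed here not to be part of the table.
known = ["basophil", "monocyte", "plasma cell"]
observed = ["monocyte", "doublet", "basophil", "doublet"]
missing = unmatched_labels(observed, known)
```

With this shape, the error message grows with the number of unmatched labels rather than with the size of the whole column.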
Just to confirm: from the development version (signature_revision), it also worked when calling the function bc.tl.sig.match_label() directly instead of copying the code.
@llumdi, the printing issue should be fixed in a8052d7.
It would be nice to have a function to convert the column names (and therefore the category labels on a plot) from dblabel to short name and vice versa. For plotting, it is more convenient to have short names, but the function could also print a table with the conversion (dblabel and short name columns) so that it can be used as a glossary attached to the report.
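The request above (convert in either direction and get a glossary table back) could be sketched with pandas alone, as below. This is an illustration under stated assumptions, not besca's API: the function `convert_labels`, the column names `dblabel` and `short_dblabel`, and the short names in the toy table are all hypothetical placeholders, not values taken from CellTypes_v1.tsv.

```python
import pandas as pd


def convert_labels(labels, nomenclature, source="dblabel", target="short_dblabel"):
    """Map labels from the `source` column to the `target` column.

    Returns the converted labels and a two-column glossary restricted
    to the labels that actually occur in the input.
    """
    labels = pd.Series(labels)
    # One row per source label, used as a lookup table.
    lookup = nomenclature.drop_duplicates(subset=source).set_index(source)[target]
    missing = sorted(set(labels.dropna().unique()) - set(lookup.index))
    if missing:
        raise ValueError(f"Labels not found in nomenclature: {missing}")
    converted = labels.map(lookup)
    glossary = (
        nomenclature.loc[nomenclature[source].isin(labels.unique()), [source, target]]
        .drop_duplicates()
        .sort_values(source)
        .reset_index(drop=True)
    )
    return converted, glossary


# Toy nomenclature table; the short names are made up for illustration.
nomenclature = pd.DataFrame(
    {
        "dblabel": ["classical monocyte", "naive B cell", "basophil"],
        "short_dblabel": ["cMono", "naive B", "Baso"],
    }
)
obs = pd.Series(["naive B cell", "classical monocyte", "naive B cell"])

short, glossary = convert_labels(obs, nomenclature)
# The reverse direction only needs the column roles swapped.
restored, _ = convert_labels(short, nomenclature, source="short_dblabel", target="dblabel")
```

In an AnnData workflow, the converted series would typically be assigned to a new `adata.obs` column for plotting, and `glossary` written out alongside the report.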