emdann closed this issue 2 years ago
@emdann
brilliant! Thank you
@emdann, in my tests some Ensembl IDs match multiple gene symbols, so storing Ensembl IDs alongside gene symbols in a single model is not straightforward. Besides Ensembl IDs, users may also need other identifiers (HGNC IDs, old gene symbols, etc.).
For that reason there is a convert method. It was initially designed to convert a human model to a mouse model (or vice versa) by mapping orthologous genes, but it can also be used for the Ensembl ID case: the user provides a map file, and the genes in the model are converted to the other format (Ensembl IDs, HGNC IDs, orthologous genes, ...).
import celltypist

model = celltypist.Model.load("some_model.pkl")
# the map file is provided by the user, based on what they'd like to transform the gene symbols to
model.convert(map_file = 'symbol2ID.csv')
model.write("/path/to/converted_some_model.pkl")
I think this is a better way to deal with the case you encountered.
Also see Usage -> Supplemental guidance -> Cross-species model conversion.
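For reference, here is a minimal sketch of how such a map file could be prepared. It assumes the file is a header-less, two-column CSV with the model's gene symbols in the first column and the target identifiers in the second; that layout is an assumption and should be checked against the conversion guidance mentioned above. The helper name write_map_file and the three example pairs are purely illustrative, and real mappings should come from your own annotation source.

```python
import csv
import tempfile
from pathlib import Path


def write_map_file(pairs, path):
    """Write (gene symbol, target ID) pairs as a two-column CSV without a header.

    Hypothetical helper: the header-less two-column layout is an assumption
    about what the map file should look like.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for symbol, target_id in pairs:
            writer.writerow([symbol, target_id])
    return path


# Illustrative symbol -> Ensembl ID pairs; verify them against your annotation.
symbol_to_ensembl = [
    ("CD3E", "ENSG00000198851"),
    ("CD19", "ENSG00000177455"),
    ("MS4A1", "ENSG00000156738"),
]

# Written to the temp directory here so the sketch does not touch the working directory.
map_path = write_map_file(
    symbol_to_ensembl, Path(tempfile.gettempdir()) / "symbol2ID.csv"
)
```

The resulting path would then be passed as map_file to model.convert, as in the snippet above.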
Hello celltypers,
While using a trained celltypist model on my data, I got this error. It took me a little while to realise it was coming from having mismatched feature names: my adata.var_names are Ensembl IDs while the model uses gene names. This made me think of two suggestions:
celltypist.annotate
?
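In case it helps others hitting the same mismatch, here is a rough sketch of a check one could run before annotating. It is not part of celltypist; feature_overlap and looks_like_ensembl are hypothetical helpers operating on plain sequences of names, and the example names are made up for illustration. In practice the inputs would presumably be adata.var_names and the model's feature list (model.features, assuming that attribute is available in your version).

```python
def feature_overlap(var_names, model_features):
    """Return the model features found in var_names and the fraction they cover."""
    features = list(model_features)
    var_set = set(var_names)
    shared = [name for name in features if name in var_set]
    fraction = len(shared) / len(features) if features else 0.0
    return shared, fraction


def looks_like_ensembl(names, threshold=0.9):
    """Heuristic: True if most names start with the 'ENS' prefix."""
    names = list(names)
    if not names:
        return False
    hits = sum(str(name).startswith("ENS") for name in names)
    return hits / len(names) >= threshold


# Illustrative inputs: data indexed by Ensembl IDs, model trained on gene symbols.
data_names = ["ENSG00000198851", "ENSG00000177455", "ENSG00000156738"]
model_names = ["CD3E", "CD19", "MS4A1"]

shared, fraction = feature_overlap(data_names, model_names)
if fraction == 0.0 and looks_like_ensembl(data_names):
    message = "No shared features: the data uses Ensembl IDs but the model uses gene symbols."
else:
    message = f"{len(shared)} of {len(model_names)} model features found in the data."
print(message)
```

A zero overlap combined with Ensembl-style names is a strong hint that either the data or the model needs its gene identifiers converted first.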