Closed by pranithavangala 1 year ago
@pranithavangala, can you confirm you are using the most recent models? You can re-download them with celltypist.models.download_models(force_update = True)
Thank you. I downloaded the recent models, and now I have many Heterogeneous cells (>60%).
@pranithavangala, when several closely related (homogeneous) cell types are scored together, they may all receive similarly high scores, which results in a Heterogeneous call. If your data contain such a spectrum of cell types, you can use the default mode to select/predict the cell type with the maximal likelihood.
@ChuanXu1 I think there is something weird going on with Immune_All_Low.pkl. When I use Immune_All_High.pkl, most cells are predicted as T cells, but when I change the model to Immune_All_Low.pkl, most cells come out as Heterogeneous or Unassigned.
@pranithavangala, as mentioned in my previous comment, Immune_All_Low contains many very similar cell types (especially in the T cell compartment), and CellTypist assigns them close scores (for example, 0.95 vs. 0.9). With a cutoff such as 0.5, more than one cell type passes, so you will likely get Heterogeneous. You can try the default mode (i.e., best match) to select/predict the cell type with the maximal likelihood.
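To make the difference between the two modes concrete, here is a minimal toy sketch. It is not CellTypist's implementation: the function names, the scores, and the rule "more than one type above the cutoff gives Heterogeneous" are assumptions that only mirror the explanation above.

```python
def call_with_cutoff(scores, cutoff=0.5):
    """Toy cutoff rule: collect every cell type whose score exceeds the cutoff."""
    passing = [name for name, score in scores.items() if score > cutoff]
    if not passing:
        return "Unassigned"      # nothing is confident enough
    if len(passing) > 1:
        return "Heterogeneous"   # several similar types all pass
    return passing[0]


def call_best_match(scores):
    """Toy best-match rule: always take the single highest-scoring cell type."""
    return max(scores, key=scores.get)


# Hypothetical scores for one cell; two T cell subtypes score almost the same.
cell = {
    "Tcm/Naive helper T cells": 0.95,
    "Tem/Effector helper T cells": 0.90,
    "Naive B cells": 0.02,
}

cutoff_call = call_with_cutoff(cell)
best_call = call_best_match(cell)
print(cutoff_call)  # Heterogeneous
print(best_call)    # Tcm/Naive helper T cells
```

With a low-hierarchy model that holds many near-identical subtypes, the cutoff rule in this sketch reports the ambiguity, while best match commits to the top score even when the runner-up is only marginally lower.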
@ChuanXu1 Thank you, I changed to best match and now the results are starting to make sense. One thing is still a little unclear: when I use Immune_All_High, most of my cells are classified as T cells, which is perfect. But when I use Immune_All_Low in best match mode, it classifies a bunch of cells as B cells. Is it possible to restrict the cell type annotation in the Immune_All_Low model based on the parent cell type assigned by Immune_All_High? For example, to gain resolution, I could subdivide my T cells into the various T cell subtypes only and not include other cell types such as B cells, fibroblasts, etc.
@pranithavangala
> Is it possible to restrict the cell type annotation in the Immune_All_Low model based on the parent cell type assigned by Immune_All_High?
Currently there is no connection between the high and low models. If we connected them, an error in the high model (such as erroneously assigning a T cell as a B cell) would be propagated to the low-level model, which would then only consider B cell subtypes for that cell.
> For example, to gain resolution, I could subdivide my T cells into the various T cell subtypes only and not include other cell types such as B cells, fibroblasts, etc.
It is usually not advisable to restrict the search scope of cell types during prediction. But if you are confident that your data contain only T cells, you can post-process the result as below:
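# Probability of every cell type in the model for each cell (cells x cell types)
prob_matrix = predictions.probability_matrix
# Keep only the columns whose cell type name contains "T cells"
prob_matrix = prob_matrix.loc[:, prob_matrix.columns.str.contains("T cells")]
# For each cell, pick the remaining cell type with the highest probability
prob_matrix.idxmax(axis = 1)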
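As a self-contained illustration of the same idea, the sketch below runs on a toy table standing in for predictions.probability_matrix. The cell names, scores, and the best_within helper are hypothetical. It uses an explicit list of allowed subtypes rather than a substring match, on the assumption that some T-lineage labels may not contain the text "T cells".

```python
import pandas as pd

# Toy stand-in for predictions.probability_matrix (rows: cells, columns: cell types).
prob_matrix = pd.DataFrame(
    {
        "Regulatory T cells": [0.10, 0.80, 0.05],
        "Tcm/Naive helper T cells": [0.70, 0.75, 0.10],
        "Naive B cells": [0.90, 0.05, 0.95],
    },
    index=["cell_1", "cell_2", "cell_3"],
)


def best_within(prob_matrix, allowed):
    """Return, per cell, the highest-scoring label among the allowed columns."""
    allowed = set(allowed)
    cols = [c for c in prob_matrix.columns if c in allowed]
    if not cols:
        raise ValueError("none of the allowed cell types are in the matrix")
    return prob_matrix[cols].idxmax(axis=1)


# Explicit (hypothetical) list of subtypes to keep.
t_subtypes = ["Regulatory T cells", "Tcm/Naive helper T cells"]

unrestricted = prob_matrix.idxmax(axis=1)
restricted = best_within(prob_matrix, t_subtypes)
print(pd.DataFrame({"unrestricted": unrestricted, "restricted": restricted}))
```

In this toy data, cell_3 scores at most 0.10 for any T cell subtype yet still receives a T cell label once the B cell column is dropped. That is the risk of restricting the search scope: cells that are not T cells get forced into the nearest T cell subtype.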
Hi,
I'm trying to run celltypist on several different datasets (PBMCs, splenocytes, etc.), and in almost all cases more than 30% of cells are being called as undetermined. I tried some high-quality public datasets as well but ended up in the same situation. I'm using the low resolution Immune cell model. Can you help me understand what I can do to troubleshoot?