Quick question about prediction result

JackieHanLab / TOSICA

Transformer for One-Stop Interpretable Cell-type Annotation

MIT License


Quick question about prediction result #3

Closed: HelloWorldLTY closed this issue 1 year ago

HelloWorldLTY commented 1 year ago

Hi, I find that in the output UMAPs of the prediction results, more than 50% of acinar cells are predicted as ductal cells, and a new cell type also appears in the delta cell region. Is this a problem with the model design, or is it caused by the similarity of these cell types? Thanks a lot.

HelloWorldLTY commented 1 year ago

Hi, I have another question about the attention score. May I know how to extract the attention weights here to find marker genes for different cell type labels? Thanks a lot.

JackieHanLab commented 1 year ago

Thanks for your interest in our tool. Acinar cells and ductal cells are naturally similar, and you can see from the Supplementary 9 result in our paper that other annotators also confuse these two cell types. As for the delta cells, because there are T cells in the reference dataset, there is a small chance that TOSICA annotates some cells in the query as T cells.

The method used to extract the attention is Attention Rollout, which comes from the paper Quantifying Attention Flow in Transformers. We did not try to use attention weights to find marker genes, but we used scanpy.tl.rank_genes_groups(method='wilcoxon') to find marker pathways for different cell types.
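For readers unfamiliar with Attention Rollout, here is a minimal NumPy sketch of the general idea as described in that paper: average each layer's attention over heads, mix in the identity matrix to account for the residual connection, re-normalize, and multiply the layers together. This is not TOSICA's own code. The function name `attention_rollout`, the `residual_weight` parameter, and the toy input shapes are assumptions made for illustration only.

```python
import numpy as np


def attention_rollout(attentions, residual_weight=0.5):
    """Combine per-layer attention maps into one input-to-output map.

    attentions: list of arrays, one per layer (first layer first), each of
    shape (heads, tokens, tokens) with rows that sum to 1.
    """
    n_tokens = attentions[0].shape[-1]
    identity = np.eye(n_tokens)
    rollout = identity
    for layer_attention in attentions:
        # Average over heads to get one (tokens, tokens) map per layer.
        head_mean = layer_attention.mean(axis=0)
        # Mix in the identity to account for the residual connection.
        mixed = residual_weight * identity + (1.0 - residual_weight) * head_mean
        # Re-normalize so that every row is still a distribution.
        mixed = mixed / mixed.sum(axis=-1, keepdims=True)
        # Propagate: this layer's attention applied on top of earlier layers.
        rollout = mixed @ rollout
    return rollout


# Toy example (hypothetical data): 2 layers, 2 heads, 3 tokens.
rng = np.random.default_rng(0)
raw = rng.random((2, 2, 3, 3))
toy_attentions = [layer / layer.sum(axis=-1, keepdims=True) for layer in raw]
rollout = attention_rollout(toy_attentions)
print(rollout.sum(axis=-1))  # each row sums to 1, up to rounding
```

Row i of the result can be read as how much output token i draws on each input token across all layers. How the per-layer attention arrays are obtained from a trained model depends on the implementation and is not covered here.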

Hope the answer can help you!

HelloWorldLTY commented 1 year ago

Hi, thanks for your answer. Actually, I do not think the problem comes from the similarity of these cells. According to atlas-level research, these two cell types are not very similar. See:

The source is from: https://www.sciencedirect.com/science/article/pii/S2405471216302927

I think the demo datasets may not be annotated well. For example, if I directly run a clustering method such as Leiden on this query dataset, the cells originally labeled 'acinar' are almost all assigned to a single cluster.
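One way to quantify this kind of check is sketched below: for each original label, find the cluster that holds most of its cells and the fraction of cells that fall there. The helper name `dominant_cluster_fraction` and the toy labels are made up for illustration and are not taken from the demo dataset.

```python
from collections import Counter, defaultdict


def dominant_cluster_fraction(labels, clusters):
    """For each original label, return (most common cluster, fraction of its cells there)."""
    per_label = defaultdict(Counter)
    for label, cluster in zip(labels, clusters):
        per_label[label][cluster] += 1
    summary = {}
    for label, counts in per_label.items():
        cluster, n_cells = counts.most_common(1)[0]
        summary[label] = (cluster, n_cells / sum(counts.values()))
    return summary


# Made-up toy data: 4 cells labeled acinar, 4 labeled ductal.
labels = ["acinar"] * 4 + ["ductal"] * 4
clusters = ["0", "0", "0", "0", "1", "1", "1", "0"]
summary = dominant_cluster_fraction(labels, clusters)
print(summary)
```

With scanpy, the cluster assignments would typically come from `sc.tl.leiden`, which stores them in `adata.obs["leiden"]` by default; the name of the column holding the original annotation depends on the dataset and is an assumption to check. A fraction close to 1 for a label means its cells form one coherent cluster.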

Thanks for your work on attention; I will take a look at it and give it a try.