Closed HelloWorldLTY closed 1 year ago
Hi, I find that in the output UMAPs of the prediction results, more than 50% of acinar cells are predicted as ductal cells, and a new cell type also appears in the delta cell part. Is this a problem with the model design, or is it due to the similarity of these cell types? Thanks a lot.
Hi, I have another question about the attention score. May I know how to extract the attention weights here to find marker genes for different cell type labels? Thanks a lot.
Thanks for your interest in our tool. Acinar cells and ductal cells are naturally similar, and you can see from the Supplementary 9 result in our paper that other annotators also confuse these two cell types. As for delta cells, because there are T cells in the reference dataset, there is a small chance that TOSICA annotates cells in the query as T cells.
The method to extract the attention is Attention Rollout. This approach comes from Quantifying Attention Flow in Transformers. We didn't try to use attention weights to find marker genes, but we used scanpy.tl.rank_genes_groups(method='wilcoxon') to find marker pathways for different cell types.
Hope the answer can help you!
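The Attention Rollout idea mentioned above can be sketched as follows. This is a minimal, generic NumPy illustration of the procedure described in Quantifying Attention Flow in Transformers, not TOSICA's own code: the function name attention_rollout, the input layout (one array of shape heads x tokens x tokens per layer) and the random toy attentions are all assumptions made for illustration. How TOSICA exposes its per-layer attention matrices is not shown here.

```python
import numpy as np


def attention_rollout(attentions):
    """Combine per-layer attention maps into one token-to-token map.

    attentions: list of arrays of shape (heads, tokens, tokens),
    ordered from the first layer to the last (assumed layout).
    """
    n_tokens = attentions[0].shape[-1]
    identity = np.eye(n_tokens)
    rollout = identity.copy()
    for layer_attn in attentions:
        # Average the attention over heads.
        a = layer_attn.mean(axis=0)
        # Account for the residual connection by mixing in the identity.
        a = 0.5 * a + 0.5 * identity
        # Re-normalize so each row sums to 1.
        a = a / a.sum(axis=-1, keepdims=True)
        # Propagate attention from this layer back to the input tokens.
        rollout = a @ rollout
    return rollout


# Toy example (hypothetical data): 2 layers, 4 heads, 5 tokens.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 4, 5, 5))
# Softmax over the last axis so each row is a valid attention distribution.
weights = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
rollout = attention_rollout([weights[0], weights[1]])
print(rollout.shape)
print(rollout.sum(axis=-1))
```

Each row of the result distributes one unit of attention over the input tokens. If a model has a classification token, its row would be the natural one to inspect, though whether those scores point to marker genes is a separate question that this sketch does not answer.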
Hi, thanks for your answer. Actually, I do not think the problem comes from the similarity of these cells. In atlas-level research, these two cell types are not very similar. See:
The source is from: https://www.sciencedirect.com/science/article/pii/S2405471216302927
I think the demo datasets may not be annotated well. For example, if I directly run a clustering method like Leiden on this query dataset, the cells originally labeled 'acinar' are almost all assigned to a single cluster.
Thanks for your work on attention. I will take a look at it and give it a try.
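The clustering check described above (comparing the original labels against cluster assignments) can be quantified with a small contingency table. The sketch below uses only the Python standard library. The helper names label_cluster_table and dominant_cluster_fraction and the toy labels are hypothetical. In practice the cluster assignments would presumably come from a Leiden run (for example scanpy.tl.leiden), and the labels from the dataset annotation.

```python
from collections import Counter, defaultdict


def label_cluster_table(labels, clusters):
    """Count how many cells of each label fall into each cluster."""
    table = defaultdict(Counter)
    for label, cluster in zip(labels, clusters):
        table[label][cluster] += 1
    return table


def dominant_cluster_fraction(table, label):
    """Return the most common cluster for a label and its share of that label's cells."""
    counts = table[label]
    cluster, n = counts.most_common(1)[0]
    return cluster, n / sum(counts.values())


# Toy data (hypothetical): 5 cells labeled acinar, 3 labeled ductal.
labels = ["acinar"] * 5 + ["ductal"] * 3
clusters = ["0", "0", "0", "0", "1", "1", "1", "1"]

table = label_cluster_table(labels, clusters)
for label in table:
    cluster, fraction = dominant_cluster_fraction(table, label)
    print(f"{label}: cluster {cluster} holds {fraction:.0%} of cells")
```

A fraction close to 1 for a label means its cells sit almost entirely in one cluster. The same table, built from true labels and predicted labels instead of clusters, would give the share of acinar cells predicted as ductal.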