Closed Dazcam closed 2 years ago
Hi, for each gene, "pct" calculates the percentage of cells in each cluster with detectable expression. This is then summarized across all genes to give a "score".
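To make the per-gene step concrete, here is a minimal sketch in plain Python. It is not the package's own code. The function name `pct_detected`, the toy data, and the rule that "detectable" means a value above 0 are all assumptions for illustration.

```python
from collections import defaultdict

def pct_detected(expr, clusters, threshold=0.0):
    """Fraction of cells per cluster in which each gene is detected.

    expr: dict mapping gene -> list of expression values, one per cell
    clusters: list of cluster labels, one per cell (same order as the values)
    threshold: values above this count as "detected" (assumed to be 0 here)
    """
    # Count how many cells belong to each cluster.
    n_cells = defaultdict(int)
    for label in clusters:
        n_cells[label] += 1

    result = {label: {} for label in n_cells}
    for gene, values in expr.items():
        # Count cells per cluster where this gene is above the threshold.
        detected = defaultdict(int)
        for value, label in zip(values, clusters):
            if value > threshold:
                detected[label] += 1
        for label, total in n_cells.items():
            result[label][gene] = detected[label] / total
    return result

# Toy data: four cells in two clusters, two genes (made up for illustration).
clusters = ["c1", "c1", "c2", "c2"]
expr = {"GeneA": [0, 2, 0, 0], "GeneB": [1, 3, 0, 5]}
pct = pct_detected(expr, clusters)
print(pct)
```

The result is one detection fraction per gene per cluster, so this step works at the cluster level rather than cell by cell.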
Hi, and thanks for responding.
That makes sense, but how is that score summarised exactly?
Is it right to assume no formal statistical test is run here?
And is it right to assume that it is the summarised gene 'score' that is compared between the reference and query datasets? Or, to put it another way, are all cells in a cluster in the query dataset assigned the identity of the cluster in the reference with the most similar 'score'?
Many Thanks.
For instance, if reference type A has 3 marker genes, and in cluster 1 their detection percentages are 0, 0.1, and 0.5, the default is just to take the mean, so 0.2 is the "score" for cluster 1.
Yes, no statistical testing; it is meant more for exploration.
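The arithmetic in that example can be written out as a short Python sketch. The helper name `pct_score`, the second cell type "B", and its values are hypothetical. Picking the type with the highest score is the reading suggested in the question, not something confirmed in this thread.

```python
def pct_score(detection_fractions):
    """Mean of the per-gene detection fractions for one cluster and one type."""
    return sum(detection_fractions) / len(detection_fractions)

# Cluster 1 against reference type A: the three marker genes from the example.
cluster1_type_a = [0, 0.1, 0.5]
score = pct_score(cluster1_type_a)
print(round(score, 3))

# Hypothetical second type "B" with two marker genes, to show a comparison.
scores_for_cluster1 = {"A": score, "B": pct_score([0.9, 0.7])}

# Assumption: the cluster is labelled with the type that scores highest.
best_type = max(scores_for_cluster1, key=scores_for_cluster1.get)
print(best_type)
```

Because the score is a plain mean, a type with a few well-detected markers can outscore a type with many weakly detected ones, which fits the "more for exploration" framing.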
I have a set of gene lists that I'm using for cell type classification and have run clustify_lists() as described in your tutorial, in which the metric parameter is set to 'pct'. I'm struggling to find a thorough explanation of what is being run under the hood when the 'pct' option is used. All I could find in the paper was:
Could you elaborate on the last (pct) part a bit? It's not 100% clear (at least to me) what is going on, particularly with regard to whether you are comparing marker genes across cells or clusters (or both).
Many thanks.