MSingerLab / COMETSC

COMET Single-Cell Marker Detection tool
BSD 3-Clause "New" or "Revised" License

Confusion about ranks #9

Open CodeInTheSkies opened 3 years ago

CodeInTheSkies commented 3 years ago

Hi there,

Great tool, and thank you for making it!

I have a basic question, and hope to get some clarification through this forum. I was going through the example output you have described here:

https://hgmd.readthedocs.io/en/latest/Output.html

In the t-SNE plots on that page, you show Cd74 and Fcer1g_c. The ranks shown for them are far from the top, yet you still describe them as being among the top-ranked. How is that? What am I missing? Do the ranks shown represent single-gene rankings? Is that why they look poor? In other words, although each gene ranks poorly on its own, the two do well as a pair, and that is why they are still shown as examples?

I would very much appreciate an explanation so that I can clearly understand what these rankings mean, for both singletons and pairs.

Thank you!

CodeInTheSkies commented 3 years ago

Just checking in to see if somebody could answer this! Thanks a lot, and happy holidays!

oshahid commented 3 years ago

Hi @CodeInTheSkies,

Thanks so much for your question, and I'm sorry this reply is so many months late. Please feel free to reach out directly at oshahid@ds.dfci.harvard.edu if this happens again, and I'll be quicker to respond in the future.

As for your question: for each entry in discrete_pairs (in this case, Cd74+Fcer1g_negation), both the paired plot and the singleton plots are shown. In this example, Cd74 (rank 112) and Fcer1g_negation (rank 5995) do not rank highly as singletons, and neither is a great marker for this particular cluster on its own. When paired together, however, they outperform all other possible gene pairs in the dataset (rank 1).
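To make the singleton-versus-pair distinction concrete, here is a minimal toy sketch in Python. It is not COMET's actual statistic: it assumes a simple F1 score as a stand-in for marker quality, and the cells, the cluster, and the gene names (GeneA, GeneB_negation, GeneC) are all hypothetical. It only shows how two genes that rank poorly on their own can still form the top-ranked pair.

```python
from itertools import combinations

def f1_score(positive_cells, cluster):
    """Toy marker score (an assumption, not COMET's test): F1 of 'gene on' vs. cluster."""
    tp = len(positive_cells & cluster)
    if tp == 0:
        return 0.0
    precision = tp / len(positive_cells)
    recall = tp / len(cluster)
    return 2 * precision * recall / (precision + recall)

# Hypothetical dataset: 40 cells, the cluster of interest is cells 0..9.
all_cells = set(range(40))
cluster = set(range(10))

# Hypothetical discretized expression: the set of cells where each marker is "on".
gene_on = {
    "GeneA": set(range(0, 20)),                        # cluster plus 10 other cells
    "GeneB_negation": all_cells - set(range(10, 25)),  # GeneB absent: cluster plus 15 others
    "GeneC": set(range(0, 9)) | {30},                  # a decent single marker
}

# Singleton scores: each marker evaluated by itself.
single_scores = {g: f1_score(cells, cluster) for g, cells in gene_on.items()}

# Pair scores: a cell counts as positive only if both markers are "on" (logical AND).
pair_scores = {
    f"{a}+{b}": f1_score(gene_on[a] & gene_on[b], cluster)
    for a, b in combinations(gene_on, 2)
}

def rank(scores):
    """Rank names by descending score; rank 1 is the best."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

single_ranks = rank(single_scores)
pair_ranks = rank(pair_scores)
print(single_ranks)
print(pair_ranks)
```

In this toy data, GeneA and GeneB_negation each cover the whole cluster but also many cells outside it, so neither is the best singleton. Their intersection, however, is exactly the cluster, so the pair ranks first among all pairs.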

Again, sorry that this is so late. Please let me know if this cleared up any issues.