Positive correlation between the identity score of the right identities and the rest

Thank you for this great package. It has worked very well in my hands: the identities in my mouse brain PFC dataset were assigned correctly.

However, looking more carefully at the data, I was surprised by the following observation: cells with a high score for their assigned identity also show a high score for the other ("wrong") identities, and vice versa, as shown in the plot.

[Image: correlation]

I would have expected the opposite behaviour: cells with a higher score for their right identity should have a lower score for the "wrong" identities, since I would expect their transcriptomes to differ more from those references. I see the same pattern when using different parameters in SingleR and also when using my custom set of markers.
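
To make the comparison concrete, here is a minimal sketch of the kind of calculation behind such a plot. It is only an illustration: the score values are made up, the function names (`pearson`, `assigned_vs_rest`) are hypothetical, and it assumes the assigned identity is simply the label with the highest score, which may not match how SingleR arrives at its final labels.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def assigned_vs_rest(scores):
    """Split each cell's scores into the top score and the mean of the others.

    scores: one row per cell, one column per candidate identity.
    Assumption: the assigned identity is the highest-scoring column.
    """
    assigned, rest = [], []
    for row in scores:
        best = max(range(len(row)), key=row.__getitem__)
        others = [s for j, s in enumerate(row) if j != best]
        assigned.append(row[best])
        rest.append(sum(others) / len(others))
    return assigned, rest


# Made-up scores for four cells and three identities (not real data).
toy_scores = [
    [0.80, 0.60, 0.55],
    [0.50, 0.30, 0.25],
    [0.40, 0.65, 0.50],
    [0.20, 0.45, 0.25],
]

assigned, rest = assigned_vs_rest(toy_scores)
r = pearson(assigned, rest)
print(f"correlation between assigned and remaining scores: {r:.3f}")
```

In this toy matrix the cells with a high top score also have high scores for the other identities, so `r` comes out positive, which is the pattern I am describing. My expectation was a negative value.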

Do you have any thoughts on this?

Thank you in advance!

dviraran / SingleR
