immunogenomics / harmony

Fast, sensitive and accurate integration of single-cell data with Harmony
https://portals.broadinstitute.org/harmony/

How to avoid overcorrection when cell states present only in a few of the samples #78

Closed (kaizen89 closed this issue 3 years ago)

kaizen89 commented 4 years ago

Hi, thanks for making this great tool. I am using it to merge samples from different patients, but some of the cells are clearly being overcorrected. Of my 12 patients, 3 have cells from both tumor and normal (juxta) sites, which differ in the expression of some genes; the other 9 have tumor-site cells only. When I run Harmony with theta=1 and lambda=2, the normal cells and the tumor cells end up in the same cluster. See the example below for cluster 0 (in blue); the bar beneath the cluster bar shows the patients (each color is one patient). Has anyone seen this before, and is there a way to avoid it?
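
One way to put a number on the mixing described above is to tabulate, for each cluster, what fraction of its cells comes from each site. The snippet below is only a minimal sketch of that check: it does not call Harmony, the function name `site_composition` is hypothetical, and the toy labels are invented for illustration rather than taken from this dataset. It assumes you can export one cluster label and one site label per cell.

```python
from collections import Counter, defaultdict


def site_composition(clusters, sites):
    """Return {cluster: {site: fraction}} from paired per-cell labels."""
    if len(clusters) != len(sites):
        raise ValueError("clusters and sites must have one entry per cell")
    counts = defaultdict(Counter)
    for cluster, site in zip(clusters, sites):
        counts[cluster][site] += 1
    return {
        cluster: {site: n / sum(c.values()) for site, n in c.items()}
        for cluster, c in counts.items()
    }


# Toy labels, invented for illustration only (not data from this issue).
clusters = [0, 0, 0, 0, 1, 1]
sites = ["tumor", "juxta", "tumor", "juxta", "tumor", "tumor"]

composition = site_composition(clusters, sites)
for cluster in sorted(composition):
    print(cluster, composition[cluster])
```

Comparing this table before and after integration, restricted to the patients that have both sites, would show whether a cluster that was mostly one site beforehand becomes evenly mixed afterwards.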

image

ilyakorsunsky commented 3 years ago

Dear @kaizen89,

Overcorrection is an important and difficult issue to address. I discuss a similar question in Issue #101.

Best, Ilya