immunogenomics / symphony

Efficient and precise single-cell reference atlas mapping with Symphony

Harmony with one variable having the same value for all cells #34

Open YOU-k opened 2 years ago

YOU-k commented 2 years ago

Hi there, I am trying to build a reference from the tumor cells in this small cell lung cancer dataset: https://cellxgene.cziscience.com/collections/62e8f058-9c37-48bc-9200-e767f318a8ec.

The authors calculated PCs based on MNN, using all cells (both immune and tumor cells). But if I want to use Symphony here, I must build a Harmony object. So my first question is: is it possible to use another dimensionality reduction instead of Harmony?

Since I still want to use Symphony as a check, I ran Harmony with one variable that has the same value for all tumor cells, which means that all cells are assumed to come from a single batch. My second question is: do you think running Harmony this way is OK?

Happy to hear your suggestions. Cheers, Yue

joycekang commented 1 year ago

Hi Yue,

Apologies for the delay in getting back to you.

The underlying Symphony model is built upon the linear mixture model framework in Harmony, so the input must be a Harmony-corrected embedding. With that said, the starting embedding does not need to be PCA (we give an example using CCA on a multimodal dataset in the paper).

If there is no batch structure in the reference, then yes, you can just run Harmony with all cells assigned to the same batch and it should be fine. If there are multiple donors (even just a few), we would still generally recommend correcting donor-specific effects.
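
To illustrate why the single-batch case should be harmless, here is a minimal NumPy sketch of a Harmony-style correction step for one soft cluster: a weighted ridge regression on an intercept plus batch indicators, with the penalty applied to the batch terms only. This is a simplified stand-in under those assumptions, not the harmony package's actual code; the function name `ridge_batch_coefficients`, the parameter `lam`, and the toy data are all hypothetical.

```python
import numpy as np


def ridge_batch_coefficients(Z, R_k, batch_onehot, lam=1.0):
    """Weighted ridge fit for one soft cluster (illustrative, hypothetical helper).

    Z: (n_cells, d) embedding.
    R_k: (n_cells,) soft-assignment weights of the cells to this cluster.
    batch_onehot: (n_cells, n_batches) batch indicator matrix.
    Returns the (n_batches, d) fitted batch terms.
    """
    n_cells = Z.shape[0]
    # Design matrix: an intercept column followed by the batch indicators.
    design = np.hstack([np.ones((n_cells, 1)), batch_onehot])
    # Penalize the batch terms only; the intercept is left unpenalized.
    penalty = np.diag([0.0] + [lam] * batch_onehot.shape[1])
    weighted = design * R_k[:, None]
    coef = np.linalg.solve(design.T @ weighted + penalty, weighted.T @ Z)
    return coef[1:]  # drop the intercept row, keep the batch terms


# Toy data (assumed): 200 cells, 5 embedding dimensions, one batch for everyone.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5))
R_k = rng.uniform(0.1, 1.0, size=200)
single_batch = np.ones((200, 1))

batch_terms = ridge_batch_coefficients(Z, R_k, single_batch)
# Subtract the batch contribution, weighted by cluster membership.
Z_corrected = Z - R_k[:, None] * (single_batch @ batch_terms)

print(np.abs(batch_terms).max())      # largest fitted batch term
print(np.allclose(Z_corrected, Z))    # whether the embedding is unchanged
```

In this sketch the single batch column is collinear with the intercept, so the unpenalized intercept absorbs the cluster mean and the fitted batch term should come out numerically zero, leaving the embedding essentially unchanged. With several donors encoded as separate batches, the batch terms would generally be nonzero, which is where the correction does real work.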