immunogenomics / symphony

Efficient and precise single-cell reference atlas mapping with Symphony
GNU General Public License v3.0

Question about finding variable genes with vst #3

Closed: alitinet closed this issue 3 years ago

alitinet commented 3 years ago

Hi,

I have a question about how you find variable genes in buildReference(). The default method is 'vst', and in the tutorial you run it on normalized counts. However, both Seurat (see course code for FindVariableFeatures) and scanpy (docs for highly_variable_genes) use raw counts for this method.

Is there a reason why you run it on normalized counts?

Thanks.

joycekang commented 3 years ago

Hi! I think you could run it on raw counts to choose genes, and that would work fine. To do so, you will need to use Approach 2 in the tutorial (the more modular approach with buildReferenceFromHarmonyObj) rather than buildReference, though we will add a parameter to use raw counts in the future. We ran it on normalized counts because for the pancreas analysis we only had access to normalized data, and we wanted to keep it consistent. I believe the default in Seurat is that if raw counts are not available, FindVariableFeatures will look in the normalized counts slot. Hope that helps, and let us know if you run into any issues!
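
For readers following the raw-versus-normalized question, here is a minimal sketch of the idea behind 'vst' selection, written in Python with NumPy only. It is not Seurat's, scanpy's, or Symphony's implementation. The function name vst_variable_genes and its parameters are hypothetical, and a polynomial fit stands in for the loess fit that the real method uses. The sketch models the mean-variance trend of the counts, standardizes each gene by its expected spread, clips extreme values, and ranks genes by the variance that remains. That trend is a property of count data, which is presumably why the method is usually applied to raw counts.

```python
import numpy as np


def vst_variable_genes(counts, n_top=2000, degree=2):
    """Rank genes by standardized variance (simplified vst-style sketch).

    counts: dense cells x genes matrix of raw counts.
    Returns (indices of the n_top genes, standardized variance per gene).
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    mean = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)
    valid = var > 0  # constant genes cannot be ranked

    # Fit log10(variance) as a function of log10(mean).
    # Assumption: a polynomial replaces the loess fit of the real method.
    log_mean = np.log10(mean[valid])
    coef = np.polyfit(log_mean, np.log10(var[valid]), degree)
    expected_sd = np.ones_like(mean)
    expected_sd[valid] = np.sqrt(10 ** np.polyval(coef, log_mean))

    # Standardize with the expected spread and clip large values.
    z = (counts - mean) / expected_sd
    z = np.minimum(z, np.sqrt(n_cells))

    std_var = np.zeros_like(mean)
    std_var[valid] = (z[:, valid] ** 2).sum(axis=0) / (n_cells - 1)

    order = np.argsort(-std_var)
    return order[:n_top], std_var


# Synthetic example: 190 Poisson genes plus 10 overdispersed genes.
rng = np.random.default_rng(0)
gene_means = rng.uniform(0.5, 20.0, size=200)
counts = rng.poisson(gene_means, size=(500, 200))
# Negative binomial with mean 5 and variance well above the mean.
counts[:, :10] = rng.negative_binomial(0.5, 1.0 / 11.0, size=(500, 10))

top, std_var = vst_variable_genes(counts, n_top=10)
print(sorted(top.tolist()))
```

In this synthetic data the overdispersed genes sit in the first ten columns, so they are the ones the ranking should surface. With real data, the raw count matrix would be passed in place of the simulated one.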

alitinet commented 3 years ago

Thanks for the detailed answer! The second approach is what I was missing.