satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Question about FindMarkers() argument latent.vars #9236

Open Flu09 opened 2 months ago

Flu09 commented 2 months ago

I have three questions and am hoping you can help.

1) I want to adjust for latent variables such as Sex, but Sex is not available for all of my samples. For some samples it is NA in the metadata data frame of my Seurat object.

Should I still include Sex in latent.vars, or do I need to remove the samples whose Sex is unknown?

Would you suggest recoding the missing values as "unknown" instead of leaving them as NA and then including Sex as a variable, or should I remove those samples?

2) If I use something such as sample or donor as a latent variable, does FindMarkers() still find the difference between only the two idents I specified (disease vs. control) and cancel the effect of anything else, such as sex or age, or would it cancel out everything?

3) I also want to ask about nCount and nGene as latent.vars: does it make sense to include them?

DarioS commented 2 months ago

Uh, couldn't you impute the value from the expression of genes on chrY, excluding the pseudoautosomal regions? nCount and nGene could be associated with cellular biology. For instance, a cancer cell with whole-genome duplication is going to have a lot more reads than a diploid cell. That is probably not unwanted variation that should be adjusted for.
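A minimal sketch of the chrY idea, written in plain Python so the logic is independent of any particular toolkit (the same steps should port to R). Everything here is an assumption for illustration: the marker list, the `impute_sex` and `chry_cpm` helper names, the per-sample pseudobulk input layout, and the 10 CPM cutoff, which would need checking against samples whose sex is known.

```python
# Hypothetical marker list: genes in the male-specific region of chrY
# (outside the pseudoautosomal regions). Adjust for your annotation.
CHRY_GENES = ("RPS4Y1", "DDX3Y", "UTY", "KDM5D", "EIF1AY", "USP9Y", "ZFY")


def chry_cpm(gene_counts, total_counts):
    """Counts per million assigned to the chrY marker genes in one sample."""
    if total_counts <= 0:
        raise ValueError("total_counts must be positive")
    y_counts = sum(gene_counts.get(g, 0) for g in CHRY_GENES)
    return 1e6 * y_counts / total_counts


def impute_sex(samples, threshold_cpm=10.0):
    """Fill in missing sex labels; recorded labels are left untouched.

    samples maps sample id -> dict with keys 'sex' (str or None),
    'gene_counts' (gene -> pseudobulk count) and 'total_counts' (int).
    threshold_cpm is an arbitrary cutoff chosen for illustration.
    """
    result = {}
    for sample_id, info in samples.items():
        if info["sex"] is not None:
            result[sample_id] = info["sex"]
        else:
            cpm = chry_cpm(info["gene_counts"], info["total_counts"])
            result[sample_id] = "M" if cpm >= threshold_cpm else "F"
    return result


# Made-up pseudobulk counts, summed over all cells of each sample.
samples = {
    "donor1": {"sex": "F", "gene_counts": {"RPS4Y1": 0}, "total_counts": 1_000_000},
    "donor2": {"sex": None, "gene_counts": {"RPS4Y1": 250, "DDX3Y": 90}, "total_counts": 2_000_000},
    "donor3": {"sex": None, "gene_counts": {"XIST": 400, "UTY": 1}, "total_counts": 1_000_000},
}
imputed = impute_sex(samples)
```

The imputed labels could then be written back into the object's metadata in place of the NA values before Sex is passed to latent.vars. Summing over all cells of a sample, rather than calling sex per cell, is a deliberate choice here because single cells often have zero counts for these genes.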

Flu09 commented 2 months ago

@DarioS That is a good idea, thank you so much. What about batch as a variable (because of different runs or sequencing platforms)? Do you think it makes sense to include it, or is it better to add only biological variables?