satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Ability to perform DE on module scores? #3719

Closed samuel-marsh closed 3 years ago

samuel-marsh commented 3 years ago

Hi,

This is not necessarily a bug report, and not quite an enhancement request either, depending on your thoughts. I'm wondering whether it's possible to adapt the current FindMarkers function to perform DE testing on module score results, and whether you think this would be a valid analysis. For instance, comparing module scores either between clusters or between experimental conditions.

Currently, when trying to run FindMarkers on a module score, this is the error that pops up (Seurat v3.2.2; R 3.6.1; macOS Catalina):

de_res <- FindMarkers(object = seurat_object, ident.1 = "group1", ident.2 = "group1", features = c("module_score_name"))
Error in intI(i, n = d[1], dn[[1]], give.dn = FALSE) : 
  invalid character indexing

I suspect this is because module scores are stored in the meta data and not with the gene-level data? I would welcome your opinion on whether such DE tests are valid on module scores and, if so, whether the current FindMarkers function could be adapted to support this when module scores are explicitly supplied through the features parameter.

Thanks! Sam

andrewwbutler commented 3 years ago

Hi Sam,

Yeah, you're correct about the error and that FindMarkers operates on the Assay-level data. I don't think we would want to broadly enable DE tests on meta.data columns, as users can store just about anything in those and some of the assumptions of the DE tests may not always be appropriate. The goal here would be to test whether a particular module score is significantly different between two groups of cells? It would take a couple of lines of data wrangling, but I would recommend just extracting the meta.data for the module score of interest and the cell grouping variable, and running something like wilcox.test directly to test for differences.
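The suggestion above can be sketched in a few lines of base R. This is a minimal illustration, not code from the Seurat team: the data frame `meta` is simulated to stand in for the object's meta.data, and the column names `group` and `module_score1` are hypothetical.

```r
# Simulated stand-in for seurat_object@meta.data.
# Column names "group" and "module_score1" are hypothetical.
set.seed(42)
meta <- data.frame(
  group = rep(c("group1", "group2"), each = 50),
  module_score1 = c(rnorm(50, mean = 0.10, sd = 0.05),
                    rnorm(50, mean = 0.20, sd = 0.05))
)

# With a real object, the equivalent starting point would be something like:
#   meta <- seurat_object@meta.data

# Two-sided Wilcoxon rank-sum test of the module score between the two groups.
res <- wilcox.test(module_score1 ~ group, data = meta)
print(res$p.value)
```

The formula interface keeps the score and the grouping variable in one data frame, which mirrors how both are stored side by side in the meta data.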

samuel-marsh commented 3 years ago

Hey Andrew,

Sounds good, kinda figured as much in terms of enabling meta.data testing. I figured I might have to do that directly but worth a shot checking with you guys too. I'll play around with running that directly.

Thanks, Sam

Marc-Benoit commented 1 year ago

Hi @samuel-marsh,

Came across this thread searching for the same question - looking to compare a module score between 2 genotypes within cell types and get stats for that comparison. The difference is obvious when visualizing on a violin plot but would like some official test to run. Have you been able to extract the module score by genotype within cell type from the meta data and run a Wilcox test?

Thank you! Marc

shuvamc95 commented 1 year ago

@Marc-Benoit were you able to find a solution to this?

samuel-marsh commented 1 year ago

Hi @Marc-Benoit @shuvamc95,

Yes, I think that within a cluster it’s a valid comparison to make. It’s important, though, to subset the cluster before running the module score, because cellular composition can affect the scoring and the choice of control genes.

After that, I just extracted the scores from the meta data and ran a standard Wilcoxon test on them.

That will tell you whether scores differ between two groups in a cluster, but it doesn’t necessarily mean that the scores are enriched in those cells.
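A rough sketch of that workflow in base R, applied per cell type, might look like the following. The data are simulated, the column names (`cell_type`, `genotype`, `module_score1`) are hypothetical, and the scores are assumed to have been computed separately on each subset beforehand. The Benjamini-Hochberg adjustment at the end is an addition that was not discussed in this thread.

```r
# Simulated meta data: two cell types, two genotypes, one module score.
# All column names and values are hypothetical.
set.seed(1)
meta <- data.frame(
  cell_type = rep(c("astrocyte", "microglia"), each = 80),
  genotype = rep(rep(c("WT", "KO"), each = 40), times = 2),
  module_score1 = rnorm(160, mean = 0.10, sd = 0.05)
)

# Simulate a genotype effect in astrocytes only.
shifted <- meta$cell_type == "astrocyte" & meta$genotype == "KO"
meta$module_score1[shifted] <- meta$module_score1[shifted] + 0.10

# Run one Wilcoxon rank-sum test per cell type (KO vs WT within each).
by_type <- split(meta, meta$cell_type)
p_values <- sapply(by_type, function(df) {
  wilcox.test(module_score1 ~ genotype, data = df)$p.value
})

# Optional: adjust for testing several cell types (not from the thread).
p_adjusted <- p.adjust(p_values, method = "BH")
print(p_adjusted)
```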

Best, Sam

giorgiatosoni commented 1 year ago

Hi,

I came across this thread and was wondering what the best way would be to statistically compare the performance of two sets of markers in the same dataset. In this case, I would assume that cellular composition would not affect the result, because I'm testing two different lists on the same dataset, so there is no need to subset.

Let's say I have lists A and B, both containing markers for astrocytes, and a dataset with annotated astrocytes. If I want to test whether list A is better at identifying astrocytes in the dataset than list B, would you do a Wilcoxon test between the scores obtained using lists A and B? Or would it be better to count all the cells with a score > 0.2 (for example) and do a chi-squared test on the proportions?
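To make the two options concrete, here is a small base R sketch on simulated scores. It does not settle which test is preferable. The column names `score_A` and `score_B` are hypothetical, and the 0.2 cutoff is only the example value from the question. Because the same cells are scored with both lists, the sketch assumes a paired Wilcoxon test; the chi-squared test shown treats the two sets of counts as independent, which may not hold here.

```r
# Simulated module scores for the same annotated astrocytes, scored twice.
# Column names and distributions are hypothetical.
set.seed(7)
n_cells <- 200
scores <- data.frame(
  score_A = rnorm(n_cells, mean = 0.30, sd = 0.10),
  score_B = rnorm(n_cells, mean = 0.22, sd = 0.10)
)

# Option 1: Wilcoxon signed-rank test on the per-cell scores.
# paired = TRUE is an assumption, made because each cell has both scores.
w <- wilcox.test(scores$score_A, scores$score_B, paired = TRUE)

# Option 2: count cells above an example threshold and compare proportions.
threshold <- 0.2
counts <- rbind(
  list_A = c(above = sum(scores$score_A > threshold),
             below = sum(scores$score_A <= threshold)),
  list_B = c(above = sum(scores$score_B > threshold),
             below = sum(scores$score_B <= threshold))
)
chi <- chisq.test(counts)

print(w$p.value)
print(counts)
print(chi$p.value)
```

Thresholding discards information about how far each score sits from the cutoff, so the two options can disagree when many scores lie close to the threshold.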

I would like to hear your thoughts on this, thanks :)

Best,

Giorgia