ncborcherding / escape

Easy single cell analysis platform for enrichment
https://www.borch.dev/uploads/screpertoire/articles/running_escape
MIT License
139 stars, 20 forks

Is it possible to dig into relative contributions of various genes to the enrichit score #14

Closed tomthomas3000 closed 2 years ago

tomthomas3000 commented 3 years ago

This is a great package - thank you for making it!

I was wondering whether it would be possible to look at the relative contribution of individual genes to the enrichIt score by any chance? Thank you.

ncborcherding commented 3 years ago

Hey Tom,

Interesting idea - do you have an example of what you have in mind?

Thanks, Nick

tomthomas3000 commented 3 years ago

For example: out of a set of 20 genes supplied to the enrichIt function, perhaps only 4-5 genes play an overwhelmingly important role in the enrichIt score output. If this were the case, would it be possible to identify the relative contribution of the individual genes to the score?
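As a rough illustration of what "relative contribution" could mean here, the sketch below scores one cell with a simple mean-rank score and then drops each gene from the set in turn to see how much the score moves. This is an assumption-laden toy: `mean_rank_score` and `leave_one_out` are hypothetical helpers, and the scoring rule is not the enrichIt or ssGSEA algorithm.

```python
# Hypothetical illustration only: a simple mean-rank score,
# NOT the enrichIt / ssGSEA calculation used by escape.

def rank_genes(expression):
    """Map each gene to its rank (1 = lowest expression); ties broken by gene name."""
    ordered = sorted(expression, key=lambda g: (expression[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}

def mean_rank_score(expression, gene_set):
    """Average rank of the gene-set members found in the cell, scaled by the number of genes."""
    ranks = rank_genes(expression)
    present = [g for g in gene_set if g in ranks]
    if not present:
        return 0.0
    return sum(ranks[g] for g in present) / (len(present) * len(ranks))

def leave_one_out(expression, gene_set):
    """Score change when each gene is dropped (positive = the gene raises the score)."""
    full = mean_rank_score(expression, gene_set)
    return {
        g: full - mean_rank_score(expression, [h for h in gene_set if h != g])
        for g in gene_set
    }

# Made-up counts for one cell and a made-up three-gene set.
cell = {"A": 0, "B": 5, "C": 1, "D": 9, "E": 2, "F": 0}
contributions = leave_one_out(cell, ["D", "B", "A"])
for gene, delta in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(gene, round(delta, 3))
```

In this toy example the highly expressed gene pulls the score up the most, while the unexpressed gene pulls it down. A real answer would need to use the package's own scoring method.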

daccachejoe commented 3 years ago

hi - I am also wondering this. Any updates?

ncborcherding commented 3 years ago

Hey, thanks for the question - I am working on a major overhaul of the package that will include an enrichment plot function. I do not have a timeline yet though.

ncborcherding commented 2 years ago

This took quite a while (actually a lot harder than I thought), but the newest dev version of escape has enrichmentPlot(), a function to examine the distribution of ranked gene order across single-cell groups.

https://ncborcherding.github.io/vignettes/escape_vignette.html#55_6_Enrichment_Plots

As of right now it just shows the mean rank across the group - I will work on adding the enrichment score and p-value calculation directly to the plot.
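To make "mean rank across the group" concrete, here is a minimal sketch that ranks genes within each cell and then averages each gene-set member's rank over the cells of a group. The function name `mean_rank_by_group` and the input layout are assumptions for illustration; this is not the enrichmentPlot() implementation.

```python
# Hypothetical sketch of "mean rank across the group";
# not the code behind enrichmentPlot().

def rank_genes(expression):
    """Map each gene to its within-cell rank (1 = lowest); ties broken by gene name."""
    ordered = sorted(expression, key=lambda g: (expression[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}

def mean_rank_by_group(cells, groups, gene_set):
    """
    cells:    {cell_id: {gene: count}}
    groups:   {cell_id: group_label}
    gene_set: genes of interest
    Returns {group_label: {gene: mean within-cell rank}}.
    """
    collected = {}
    for cell_id, expression in cells.items():
        ranks = rank_genes(expression)
        per_gene = collected.setdefault(groups[cell_id], {})
        for gene in gene_set:
            if gene in ranks:
                per_gene.setdefault(gene, []).append(ranks[gene])
    return {
        group: {gene: sum(r) / len(r) for gene, r in per_gene.items()}
        for group, per_gene in collected.items()
    }

# Made-up data: three cells in two groups.
cells = {
    "c1": {"A": 3, "B": 0, "C": 1},
    "c2": {"A": 2, "B": 1, "C": 0},
    "c3": {"A": 0, "B": 4, "C": 1},
}
groups = {"c1": "g1", "c2": "g1", "c3": "g2"}
result = mean_rank_by_group(cells, groups, ["A", "B"])
print(result)
```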

tomthomas3000 commented 2 years ago

Hope you are keeping well!

This is a fantastic addition to the package. I had another query if that’s alright?

I have pseudobulked the counts within single-cell samples, essentially bulking them ‘in silico’. I have a list of genes that I want to ‘score’ these samples for. Is it acceptable to use escape to score these in-silico bulked samples for the gene list, and hence derive a vector for each sample? I would be keen to hear your thoughts!

Kind Regards, Tom

ncborcherding commented 2 years ago

Hey Tom,

Absolutely - I think there are currently two approaches in the field to gene set enrichment for single-cell data: 1) pseudobulk and 2) true single-cell enrichment. Both have pluses and minuses - I think pseudobulk produces more robust enrichment results, for example, while calculating enrichment at the single-cell level is probably better for analysing heterogeneity. You can use escape for either approach.

Let me know if you have any other questions. Nick
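For the pseudobulk route, the aggregation step can be sketched as below, assuming that pseudobulking means summing raw counts per gene over all cells of a sample. The `pseudobulk` helper and the dictionary layout are hypothetical; the resulting per-sample profiles would then be passed to whichever scoring method is used.

```python
from collections import defaultdict

def pseudobulk(cell_counts, cell_to_sample):
    """
    Sum raw counts per gene across all cells belonging to the same sample.
    cell_counts:    {cell_id: {gene: count}}
    cell_to_sample: {cell_id: sample_id}
    Returns {sample_id: {gene: summed count}}.
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for cell_id, counts in cell_counts.items():
        sample = cell_to_sample[cell_id]
        for gene, n in counts.items():
            bulk[sample][gene] += n
    return {sample: dict(genes) for sample, genes in bulk.items()}

# Made-up data: three cells from two samples.
cell_counts = {
    "cell1": {"CD3E": 2, "MS4A1": 0},
    "cell2": {"CD3E": 5, "MS4A1": 1},
    "cell3": {"CD3E": 0, "MS4A1": 7},
}
cell_to_sample = {"cell1": "S1", "cell2": "S1", "cell3": "S2"}
bulked = pseudobulk(cell_counts, cell_to_sample)
print(bulked)
```

Scoring the bulked profiles gives one value per sample and gene set, whereas scoring each cell keeps the within-sample spread, which is the heterogeneity trade-off described above.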

tomthomas3000 commented 2 years ago

Dear Nick,

Thank you so much for your response.

Another follow-up question, if you don’t mind. I am currently pseudobulking all genes (sum of all counts) in a sample of single-cell data, and then running escape straight away to derive a vector for the scored gene list.

Do you think normalisation is required before ‘scoring’ samples as such? Thanks.

Kind Regards, Tom

ncborcherding commented 2 years ago

Hey Tom,

Both the "ssGSEA" and the "UCell" methods use raw count data, so there is no need for normalization before running.

Nick
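One way to see why a score built on within-sample gene ranks may not need library-size normalization: multiplying every count in a sample by the same positive constant leaves the gene ordering unchanged. The sketch below checks this for a counts-per-million scaling. It only demonstrates the rank invariance, under the assumption that the score depends on within-sample ranks; it does not reproduce ssGSEA or UCell.

```python
# Illustration of rank invariance under per-sample scaling.
# This is not an implementation of ssGSEA or UCell.

def within_sample_ranks(counts):
    """Map each gene to its rank within one sample (1 = lowest); ties broken by gene name."""
    ordered = sorted(counts, key=lambda g: (counts[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}

# Made-up raw counts for one pseudobulked sample.
raw = {"A": 10, "B": 0, "C": 250, "D": 3}

# Counts-per-million: every gene is scaled by the same positive factor.
library_size = sum(raw.values())
cpm = {gene: n / library_size * 1_000_000 for gene, n in raw.items()}

same_ranks = within_sample_ranks(raw) == within_sample_ranks(cpm)
print("ranks unchanged by CPM scaling:", same_ranks)
```

Transformations that act differently on different genes (for example, gene-wise scaling across samples) can change the ordering, so this argument applies only to per-sample scaling.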