Sage-Bionetworks / NF_LandscapePaper_2019

This repository hosts all the code used to generate analyses and figures for the landscape paper

LV Drug KS test #49

Closed allaway closed 4 years ago

allaway commented 4 years ago

We wanted to get a sense of which LVs are druggable, so I ran a GSEA-like analysis on each latent variable (LV), treating the "universe of druggable targets" as the gene set to test for enrichment.
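For readers unfamiliar with the approach, here is a minimal sketch of a GSEA-style weighted K-S running-sum score with a label-permutation p-value. It is not the code from the notebook in this PR; the function names, the `weight` parameter, and the toy genes and loadings are all assumptions made for illustration.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """GSEA-style weighted K-S running-sum statistic.

    ranked_genes: genes sorted by descending score (e.g. LV loading).
    scores: the matching scores, in the same order.
    gene_set: genes to test for enrichment (e.g. druggable targets).
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    # Hits step up in proportion to |score|^weight; misses step down evenly.
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total = sum(hit_weights)
    if total == 0:
        return 0.0
    running, best = 0.0, 0.0
    for w, h in zip(hit_weights, hits):
        running += w / total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


def permutation_p_value(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Two-sided p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    n_in_set = sum(g in gene_set for g in ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        fake_set = set(rng.sample(ranked_genes, n_in_set))
        if abs(enrichment_score(ranked_genes, scores, fake_set)) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Toy data (hypothetical): six genes ranked by loading in one LV.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]
loadings = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5]
druggable = {"geneA", "geneB"}
es = enrichment_score(genes, loadings, druggable)
p = permutation_p_value(genes, loadings, druggable, n_perm=200)
```

A positive score means the set sits toward the top of the ranking, and a negative score means it sits toward the bottom.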

The p-values don't make much sense to me, but I think this might be because each LV only has a subset of the full list of 1500 druggable targets in it, even if there are many druggable targets in each LV. I could also try the reverse analysis, where my gene sets are the LVs, and the input is the list of druggable targets, but I'm not sure that would change much, and there is not a great way to rank the targets...
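One quick way to check the coverage hypothesis is to count how many druggable targets actually carry a non-zero loading in each LV. This is only a sketch: the dict-of-dicts layout for the loadings, the `min_abs_loading` cutoff, and the toy values are assumptions, not the structures used in the notebook.

```python
def target_coverage(lv_loadings, druggable_targets, min_abs_loading=0.0):
    """Per LV: (number of targets present, fraction of all druggable targets).

    lv_loadings: {lv_name: {gene: loading}} (assumed layout).
    A gene counts as present when |loading| > min_abs_loading.
    """
    coverage = {}
    for lv, loadings in lv_loadings.items():
        present = {
            gene
            for gene, w in loadings.items()
            if abs(w) > min_abs_loading and gene in druggable_targets
        }
        coverage[lv] = (len(present), len(present) / len(druggable_targets))
    return coverage


# Toy data (hypothetical gene and LV names).
toy_lvs = {
    "LV1": {"geneA": 0.9, "geneB": 0.0, "geneC": 0.4},
    "LV2": {"geneB": 1.2, "geneD": 0.3},
}
toy_targets = {"geneA", "geneB", "geneC", "geneE"}
cov = target_coverage(toy_lvs, toy_targets)
```

If the fraction is small for most LVs, the enrichment test for each LV only ever sees a small slice of the target list.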

I'm curious, @jaclyn-taroni or @cgreene: have you tried this type of analysis to get more information about each LV? Looking at the original PLIER paper, they too used GSEA to assess whether LVs were enriched in known pathways, but I don't think this

http://htmlpreview.github.io/?https://raw.githubusercontent.com/Sage-Bionetworks/NF_LandscapePaper_2019/a877bdc5d912463aa9eafe6d97e157d5c1e0b301/results/18-lv-drug-target-enrichment.html

Based on this analysis, I don't think a weighted K-S test is the right approach to assess whether individual drugs have targets that are enriched in an LV, where the LV is the "gene set", because we don't have a great target ranking metric and the LVs are pretty large. I suppose we could filter LV genes by loading, but I'm not sure how well a K-S test would behave when comparing, say, 3 drug targets to an LV of 50 genes. I think a different test is needed here. Perhaps a Mann-Whitney test? What do you think?
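To make the Mann-Whitney idea concrete, here is a minimal sketch that compares the loadings of a drug's targets with the loadings of the remaining genes in one LV. It uses midranks for ties and a tie-corrected normal approximation without a continuity correction. With only about 3 targets the approximation is rough, so an exact or permutation p-value would likely be the safer choice. The function names and toy loadings are assumptions.

```python
import math


def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test; returns (U for x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction for the variance of U.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u1, 1.0  # every value is tied, so there is no evidence either way
    z = (u1 - n1 * n2 / 2) / math.sqrt(var)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Toy data (hypothetical): loadings of 3 drug targets vs 9 other LV genes.
target_loadings = [10.0, 11.0, 12.0]
other_loadings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
u, p_value = mann_whitney_u(target_loadings, other_loadings)
```

Unlike the K-S approach, this only needs the loadings as a ranking within the LV and does not require a separate ranking metric for the targets.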

Either way - I'd like to close out this analysis and create another notebook for whatever comes next.

This PR also updates the lockfile to newer package versions, adds clusterProfiler, and defines the Sage RAN as a source so that renv can find synapser and pythonembedinR.

Closes #41

allaway commented 4 years ago

I think the next natural step is to do a breakdown by drug AND LV (double group by?) to see if any particular drug is enriched.

What do you think about this?
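The drug-by-LV breakdown could be sketched roughly as below: collect the loadings of each drug's targets within each LV, keyed by the (drug, LV) pair, then summarise each group. The input layouts, names, and toy values are assumptions for illustration, and the summary (count and mean loading) is a stand-in for whatever per-group test ends up being used.

```python
from collections import defaultdict


def group_by_drug_and_lv(drug_targets, lv_loadings):
    """Collect target loadings for every (drug, LV) pair.

    drug_targets: {drug: set of target genes} (assumed layout).
    lv_loadings: {lv_name: {gene: loading}} (assumed layout).
    """
    groups = defaultdict(list)
    for lv, loadings in lv_loadings.items():
        for drug, targets in drug_targets.items():
            for gene in sorted(targets):  # sorted for a stable order
                if gene in loadings:
                    groups[(drug, lv)].append(loadings[gene])
    return dict(groups)


def summarise(groups):
    """Per (drug, LV): number of targets found and their mean loading."""
    return {
        key: {"n_targets": len(vals), "mean_loading": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }


# Toy data (hypothetical drug, gene, and LV names).
toy_drug_targets = {
    "drug_x": {"geneA", "geneB"},
    "drug_y": {"geneC"},
}
toy_lv_loadings = {
    "LV1": {"geneA": 1.0, "geneB": 3.0, "geneD": 0.2},
    "LV2": {"geneC": 0.5, "geneA": 2.0},
}
summary = summarise(group_by_drug_and_lv(toy_drug_targets, toy_lv_loadings))
```

Pairs with no overlapping genes simply do not appear in the result, which keeps sparse drug-by-LV combinations out of any downstream test.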

thanks for reviewing!