EngreitzLab / gene_network_evaluation

Evaluation framework for computationally inferred gene networks from single-cell data.

Query OpenTargets and GWAS Create Adjacency Matrix for GSEA #7

Closed: anderseng closed this 5 months ago

anderseng commented 5 months ago

I have added two notebooks that demonstrate how to query OpenTargets with BigQuery and filter the resulting GWAS study-to-gene mapping. This step doesn't necessarily need to be redone (unless we want to adjust the query or the filtering), but I do think it is important to document exactly how we ran the query used to create the adjacency matrix.

Perhaps the more important part of this PR is that I have created a data frame (stored as generate_gwas_benchmarks/gwas_data/OpenTargets_L2G_Filtered.csv) and a companion function (stored in generate_gwas_benchmarks/2_create_adjacency_matrix.py) that lets the user read in the filtered data frame and output a matrix that can be used for downstream enrichment analysis. The matrix is continuous by default; if a threshold is provided (e.g. L2G score > 0.5), it is binary instead.
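To illustrate the continuous-versus-binary behaviour described above, here is a minimal sketch. It is not the actual contents of 2_create_adjacency_matrix.py: the function name, the column names (`study_id`, `gene_id`, `l2g_score`), the gene-by-study orientation, and the toy rows are all assumptions made for illustration.

```python
import pandas as pd


def create_adjacency_matrix(l2g, threshold=None, study_col="study_id",
                            gene_col="gene_id", score_col="l2g_score"):
    """Pivot a long study/gene/score table into a gene x study matrix.

    All column names are hypothetical defaults, not the real CSV schema.
    """
    # Keep the highest score when a study-gene pair appears more than once;
    # pairs that never occur are filled with 0.
    matrix = l2g.pivot_table(index=gene_col, columns=study_col,
                             values=score_col, aggfunc="max", fill_value=0.0)
    if threshold is not None:
        # Binary mode: 1 where the score exceeds the threshold, else 0.
        matrix = (matrix > threshold).astype(int)
    return matrix


# Toy data for illustration only; not taken from OpenTargets.
toy = pd.DataFrame({
    "study_id": ["S1", "S1", "S2"],
    "gene_id": ["G1", "G2", "G1"],
    "l2g_score": [0.9, 0.3, 0.6],
})
continuous = create_adjacency_matrix(toy)            # scores kept as-is
binary = create_adjacency_matrix(toy, threshold=0.5)  # thresholded to 0/1
```

With the real file, the toy frame would presumably be replaced by `pd.read_csv("generate_gwas_benchmarks/gwas_data/OpenTargets_L2G_Filtered.csv")`, with the column arguments adjusted to match its actual headers.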

@aron0093, is this a good stage to get your help in figuring out how we want to format this data for GSEA so that your downstream functions will work?

Happy to add more before you review if that would be easier. I haven't created any plotting functions yet, as I wanted to check in first about fitting this in with the existing GSEA functionality.

anderseng commented 5 months ago

Closing to put the new functions into Snakemake format and bring them in line with the rest of the repo's style.