EngreitzLab / gene_network_evaluation

Evaluation framework for computationally inferred gene networks from single-cell data.

How are we handling promoter vs enhancer motif outputs from evaluation? #21

Open adamklie opened 3 weeks ago

adamklie commented 3 weeks ago

We used to have a column for this in the output file for motif enrichment, but that is not there in the latest outputs (presumably because E2G links weren't there yet).

ProgramID  EPType    TFMotif  PValue    FDR       Enrichment
K60_1      Promoter  AHR      0.044631  0.210088  1.594955
K60_10     Promoter  AHR      0.351685  0.67633   1.242518
K60_11     Promoter  AHR      0.681555  0.885289  0.901666
K60_12     Promoter  AHR      0.446282  0.745748  1.204339

How do we want to handle this more generally? The two ways I could see for the dashboard:

  1. Include a type column for this regardless of what the enrichment is run on.
  2. Output separate files for each type and name them differently.

1) seems better and more flexible, but either will work.
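To make the two options concrete, here is a minimal sketch in plain Python. The helper names `split_by_type` and `merge_with_type` are hypothetical and not part of the pipeline, and the Enhancer row is a made-up placeholder rather than a real result. It suggests the two layouts are interconvertible, so the choice is mostly about what the dashboard loads by default.

```python
from collections import defaultdict

# Option 1 layout: one table with an explicit EPType column.
rows = [
    {"ProgramID": "K60_1", "EPType": "Promoter", "TFMotif": "AHR", "PValue": 0.044631},
    {"ProgramID": "K60_10", "EPType": "Promoter", "TFMotif": "AHR", "PValue": 0.351685},
    # Made-up placeholder row for illustration only, not a real result.
    {"ProgramID": "K60_1", "EPType": "Enhancer", "TFMotif": "AHR", "PValue": 0.5},
]

def split_by_type(rows, type_col="EPType"):
    """Option 1 -> option 2: one table per type, without the type column."""
    tables = defaultdict(list)
    for row in rows:
        rest = {k: v for k, v in row.items() if k != type_col}
        tables[row[type_col]].append(rest)
    return dict(tables)

def merge_with_type(tables, type_col="EPType"):
    """Option 2 -> option 1: tag each row with the type of its source table."""
    merged = []
    for ep_type, table in tables.items():
        for row in table:
            merged.append({**row, type_col: ep_type})
    return merged

per_type = split_by_type(rows)          # what option 2 would write to separate files
round_trip = merge_with_type(per_type)  # what option 1 would write to a single file
```

With option 2, each entry of `per_type` would be written to its own file; with option 1, `round_trip` would be written once and filtered on the type column at load time.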

aron0093 commented 1 week ago

The config file for the Snakemake pipeline has separate parameters for P2G and E2G links, and the pipeline would store outputs for these two separately with appropriate naming. The motif enrichment code itself will now accept any genomic coordinates mapped to genes and run the enrichment without requiring a "class" column in the input or requiring the user to specify a class.

I would prefer option 2 for the dashboard, since the idea is to use the pipeline outputs as the default input for the dashboard.

aron0093 commented 1 week ago

The code needs to be further modified so that it no longer expects a seq_class column. That would bring it in line with standard E2G output formats, so the user would not have to add this column manually before running our pipeline.
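A minimal sketch of what treating that column as optional could look like. The function name `group_links`, the default label `"all"`, and every field name other than `seq_class` are assumptions for illustration, not the pipeline's actual code.

```python
def group_links(links, class_col="seq_class", default_label="all"):
    """Group coordinate-to-gene links by class when the column is present.

    Links that lack `class_col` are placed under `default_label`, so input
    without the column ends up in a single group and needs no manual edits.
    """
    groups = {}
    for link in links:
        label = link.get(class_col, default_label)
        groups.setdefault(label, []).append(link)
    return groups

# Hypothetical links without a seq_class column (placeholder coordinates).
links = [
    {"chr": "chr1", "start": 1000, "end": 1500, "gene": "GENE_A"},
    {"chr": "chr1", "start": 9000, "end": 9400, "gene": "GENE_B"},
]
groups = group_links(links)
```

Enrichment would then run once per key of `groups`, whether or not the input carried a class column.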

adamklie commented 1 week ago

Here is our proposed format for output files from this step:

For the jamboree we will just run each one individually and save the results as such. The main idea here is to have a separate file for each that the dash app will load. There will be several dropdowns for the user to choose what they want to visualize.

adamklie commented 6 days ago

I've also implemented something I think we should discuss at some point. For convenience, I adjusted the p-values jointly, after calculating all the Pearson tests across programs. Would it be better to do FDR correction at the program level instead?
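To frame that question, here is a minimal sketch of the two alternatives in plain Python, assuming the correction is Benjamini-Hochberg (the thread does not say which method is used). The function names and the example p-values are made up for illustration.

```python
from collections import defaultdict

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def fdr_global(records):
    """Correct once over every test from every program."""
    return bh_adjust([r["PValue"] for r in records])

def fdr_per_program(records):
    """Correct separately within each program's tests."""
    idx_by_program = defaultdict(list)
    for i, r in enumerate(records):
        idx_by_program[r["ProgramID"]].append(i)
    out = [0.0] * len(records)
    for idxs in idx_by_program.values():
        qvals = bh_adjust([records[j]["PValue"] for j in idxs])
        for i, q in zip(idxs, qvals):
            out[i] = q
    return out

# Made-up p-values for illustration only.
records = [
    {"ProgramID": "A", "PValue": 0.01},
    {"ProgramID": "A", "PValue": 0.04},
    {"ProgramID": "B", "PValue": 0.03},
]
q_global = fdr_global(records)
q_program = fdr_per_program(records)
```

The two approaches generally give different adjusted values because the family of hypotheses differs: all tests at once versus each program's tests on their own.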