broadinstitute / ABC-Enhancer-Gene-Prediction

Cell type specific enhancer-gene predictions using ABC model (Fulco, Nasser et al, Nature Genetics 2019)
MIT License

Change some prediction column names of ABC #202

Closed atancoder closed 8 months ago

atancoder commented 8 months ago

Rename some prediction columns so that their meaning is clearer for downstream applications.

See column renaming here

![image](https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction/assets/10254642/dd10f0d2-2b95-4f94-934b-da2578f827cc)

Test Plan

Ran chr22 w/ DHS + H3K27ac

(abc-env) [atan5133@sh03-ln05 login /oak/stanford/groups/engreitz/Users/atan5133/ABC-Enhancer-Gene-Prediction]$ zcat results/K562_chr22/Predictions/EnhancerPredictionsAllPutative.tsv.gz | head -1
chr     start   end     name    class   activity_base   activity_base_enh       activity_base_squared_enh       normalized_dhs_enh      normalized_h3k27ac_enh      TargetGene      TargetGeneTSS   TargetGeneExpression    TargetGenePromoterActivityQuantile      TargetGeneIsExpressed   TargetGeneEnsembl_ID    normalized_dhs_prom normalized_h3k27ac_prom distance        isSelfPromoter  powerlaw_contact        powerlaw_contact_reference      hic_contact     hic_contact_pl_scaled       hic_pseudocount hic_contact_pl_scaled_adj       ABC.Score.Numerator     ABC.Score       powerlaw.Score.Numerator        powerlaw.Score  CellType    hic_contact_squared

Ran chr22 w/ DHS only. Note that there are no H3K27ac columns.

(abc-env) [atan5133@sh03-ln05 login /oak/stanford/groups/engreitz/Users/atan5133/ABC-Enhancer-Gene-Prediction]$ zcat results/K562_chr22/Predictions/EnhancerPredictionsAllPutative.tsv.gz | head -1
chr     start   end     name    class   activity_base   activity_base_enh       activity_base_squared_enh       normalized_dhs_enh      TargetGene      TargetGeneTSS       TargetGeneExpression    TargetGenePromoterActivityQuantile      TargetGeneIsExpressed   TargetGeneEnsembl_ID    normalized_dhs_prom     distance    isSelfPromoter  powerlaw_contact        powerlaw_contact_reference      hic_contact     hic_contact_pl_scaled   hic_pseudocount hic_contact_pl_scaled_adj   ABC.Score.Numerator     ABC.Score       powerlaw.Score.Numerator        powerlaw.Score  CellType        hic_contact_squared
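The manual header check above could also be scripted. The following is a minimal sketch, not part of this PR: `read_header` and `h3k27ac_columns` are hypothetical helper names, and the only assumption is that the predictions file is a gzipped, tab-separated file with a single header line.

```python
import gzip


def read_header(path):
    """Return the column names from the first line of a gzipped TSV file."""
    with gzip.open(path, "rt") as handle:
        return handle.readline().rstrip("\n").split("\t")


def h3k27ac_columns(columns):
    """Return the column names that mention H3K27ac (case-insensitive)."""
    return [name for name in columns if "h3k27ac" in name.lower()]
```

Calling `h3k27ac_columns(read_header("results/K562_chr22/Predictions/EnhancerPredictionsAllPutative.tsv.gz"))` should list `normalized_h3k27ac_enh` and `normalized_h3k27ac_prom` for the DHS + H3K27ac run and return an empty list for the DHS-only run, matching the two headers shown above.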

Full ABC genome-wide CRISPR benchmark: For good measure, I tested different combos (DHS only, ATAC + H3K27ac) against the reference predictions that we've made before. The results are exactly the same.
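For a renaming-only change, a direct file comparison is one more way to confirm that nothing but the column names moved. This is a rough sketch under stated assumptions, not the benchmark procedure used here: `load_rows` and `rows_match` are hypothetical helpers, both files are assumed to be gzipped TSVs with rows in the same order, and `rename_map` (old name to new name) must be supplied by the caller from the renaming table in this PR.

```python
import csv
import gzip


def load_rows(path):
    """Read a gzipped TSV into a list of dicts keyed by column name."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def rows_match(reference_rows, new_rows, rename_map):
    """Return True if new_rows equal reference_rows after renaming columns.

    rename_map maps old column names to new ones; columns not in the map
    are expected to keep their name. Values are compared as strings.
    """
    if len(reference_rows) != len(new_rows):
        return False
    for ref, new in zip(reference_rows, new_rows):
        renamed = {rename_map.get(col, col): value for col, value in ref.items()}
        if renamed != new:
            return False
    return True
```

Because the values are compared as raw strings, any difference in numeric formatting between the two files would be reported as a mismatch, which is the conservative behaviour for a pure renaming check.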


atancoder commented 8 months ago

Tests pass locally but fail in CircleCI due to a timeout. Speeding up the tests in another PR.