broadinstitute / ABC-Enhancer-Gene-Prediction

Cell type specific enhancer-gene predictions using ABC model (Fulco, Nasser et al, Nature Genetics 2019)
MIT License

Failure to replicate results #222

Closed PeterWang0502 closed 5 months ago

PeterWang0502 commented 6 months ago

Hello, I'm trying to replicate your prediction results for "CD4-positive_helper_T_cell-ENCODE", published in Nature in 2021 and available on the lab page: https://mitra.stanford.edu/engreitz/oak/public/Nasser2021/AllPredictions.AvgHiC.ABC0.015.minus150.ForABCPaperV3.txt.gz. However, I was unable to do so with the information provided in the paper.

I'm using the ENCODE DNase-seq data ENCFF198FEP and ENCFF905PBC, the H3K27ac ChIP-seq data ENCFF063OFD, ENCFF592HYA, and ENCFF728UTH, and the average Hi-C data, as suggested in Supplementary Table 2. Since the ABC model only takes sorted BAM files as input for DNase and ChIP, I used samtools to sort and index the BAM files downloaded from ENCODE, with the following commands:

    samtools sort input.bam -o output.bam
    samtools index output.bam

For some reason, in the enhancer prediction results I get from running the ABC model on these datasets, all "promoter" candidate elements have an ABC score of 1. I've also tried running with only DNase-seq, and without the average Hi-C data (using powerlaw instead), but I still get similar results.

Is my ABC run not working correctly, or did I miss something? I am running in abc-env. Thank you!

atancoder commented 6 months ago

Self promoter elements having a score of 1 was an intentional change in the latest versions of ABC. To reproduce the 2021 paper, you'd have to go to a previous version of the repo: https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction/tree/master
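
One way to confirm that this is what you are seeing is to count the self-promoter rows in your predictions table and check how many have a score of exactly 1. The following is a minimal Python sketch, not code from the ABC repository. It assumes a tab-separated table with columns named `isSelfPromoter` and `ABC.Score`; the helper name and the inline rows are made up for illustration, so adjust the column names to match your own output.

```python
import csv
import io


def summarize_self_promoters(tsv_text, score_col="ABC.Score", flag_col="isSelfPromoter"):
    """Return (number of self-promoter rows, number of those with score == 1).

    Column names are assumptions about the predictions table, not a
    documented contract; pass different names if your file differs.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    total = 0
    at_one = 0
    for row in reader:
        # The flag is assumed to be stored as the text "True" or "False".
        if row[flag_col].strip().lower() == "true":
            total += 1
            if float(row[score_col]) == 1.0:
                at_one += 1
    return total, at_one


# Made-up illustration rows, not real ABC output.
example_rows = "\n".join([
    "name\tclass\tTargetGene\tisSelfPromoter\tABC.Score",
    "p1\tpromoter\tGENE1\tTrue\t1.0",
    "e1\tintergenic\tGENE1\tFalse\t0.05",
    "p2\tpromoter\tGENE2\tTrue\t1.0",
    "p1\tpromoter\tGENE2\tFalse\t0.03",
])

total, at_one = summarize_self_promoters(example_rows)
print(f"{at_one} of {total} self-promoter rows have a score of 1")
```

If every self-promoter row has a score of 1 while promoter elements paired with other genes do not, the output is consistent with the intentional change described above rather than with a broken run.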

PeterWang0502 commented 6 months ago

Thanks for the quick reply! For genes whose self-promoter element has an ABC score of 1, how should we interpret the significance of their other candidate elements (genic and intergenic)? In other words, what ABC score threshold should we use to select candidate elements as regulatory enhancers if we ignore self-promoter elements?

atancoder commented 6 months ago

The threshold varies based on your inputs. See https://abc-enhancer-gene-prediction.readthedocs.io/en/latest/usage/methods.html#interpreting-the-abc-score
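
As an illustration of applying such a threshold while ignoring self-promoter elements, here is a hedged Python sketch. The column names (`name`, `TargetGene`, `isSelfPromoter`, `ABC.Score`), the helper `select_enhancers`, the sample rows, and the threshold value 0.02 are all assumptions chosen for illustration. Take the actual cutoff from the documentation linked above for your inputs.

```python
import csv
import io


def select_enhancers(tsv_text, threshold, score_col="ABC.Score", flag_col="isSelfPromoter"):
    """Keep element-gene pairs at or above `threshold`, skipping self-promoter rows.

    Column names are assumptions about the predictions table; the
    threshold must come from the ABC documentation for your inputs.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        # Self-promoter rows are skipped regardless of their score.
        if row[flag_col].strip().lower() == "true":
            continue
        score = float(row[score_col])
        if score >= threshold:
            kept.append((row["name"], row["TargetGene"], score))
    return kept


# Made-up illustration rows, not real ABC output.
sample_rows = "\n".join([
    "name\tclass\tTargetGene\tisSelfPromoter\tABC.Score",
    "p1\tpromoter\tGENE1\tTrue\t1.0",
    "e1\tintergenic\tGENE1\tFalse\t0.05",
    "e2\tgenic\tGENE1\tFalse\t0.01",
    "p2\tpromoter\tGENE1\tFalse\t0.03",
])

# 0.02 is a placeholder value for this sketch only.
selected = select_enhancers(sample_rows, threshold=0.02)
for name, gene, score in selected:
    print(name, gene, score)
```

In this sketch a promoter element that belongs to a different gene (`p2` here) is treated like any other candidate element; only the row flagged as the target gene's own promoter is excluded.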