Closed by PeterWang0502 5 months ago
Self-promoter elements having an ABC score of 1 is an intentional change in the latest versions of ABC. To reproduce the 2021 paper, you'd have to use a previous version of the repo: https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction/tree/master
Thanks for the quick reply! For genes whose self-promoter element has an ABC score of 1, how should we interpret the significance of their remaining candidate elements (genic and intergenic)? In other words, if we ignore self-promoter elements, what ABC score threshold should we use to select candidate elements as regulatory enhancers?
The threshold varies based on your inputs. See https://abc-enhancer-gene-prediction.readthedocs.io/en/latest/usage/methods.html#interpreting-the-abc-score
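As a rough illustration of applying a threshold while ignoring self-promoter elements, here is a minimal sketch. It assumes a tab-separated predictions file whose header row includes columns named `ABC.Score` and `isSelfPromoter`, with self-promoter rows marked `True`. These names and values are assumptions, so check them against the header of your own output. The function name `filter_abc` is hypothetical, and the threshold is passed in as an argument because the appropriate value depends on your inputs.

```shell
# filter_abc FILE THRESHOLD
# Print the header plus every row that is not a self-promoter element and
# whose ABC score is at least THRESHOLD.
# Assumed column names: "ABC.Score" and "isSelfPromoter" (adjust as needed).
filter_abc() {
  awk -F'\t' -v thr="$2" '
    NR == 1 {
      # Locate the needed columns by name rather than by position.
      for (i = 1; i <= NF; i++) {
        if ($i == "ABC.Score") score = i
        if ($i == "isSelfPromoter") selfp = i
      }
      if (!score || !selfp) {
        print "missing expected column" > "/dev/stderr"
        exit 1
      }
      print
      next
    }
    # Keep non-self-promoter rows at or above the threshold.
    $selfp != "True" && ($score + 0) >= (thr + 0)
  ' "$1"
}
```

For example, `filter_abc predictions.tsv 0.02` would keep non-self-promoter rows scoring at least 0.02, where both the filename and the value are placeholders.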
Hello, I'm trying to replicate your prediction results for "CD4-positive_helper_T_cell-ENCODE", published in Nature in 2021 and available on the lab page: https://mitra.stanford.edu/engreitz/oak/public/Nasser2021/AllPredictions.AvgHiC.ABC0.015.minus150.ForABCPaperV3.txt.gz, but I was unable to do so with the information provided in the paper.
I'm using ENCODE DNase-seq data (ENCFF198FEP and ENCFF905PBC), H3K27ac ChIP-seq data (ENCFF063OFD, ENCFF592HYA, and ENCFF728UTH), and the average Hi-C data, as suggested in Supplementary Table 2. Since the ABC model only takes sorted BAM files as input for DNase and ChIP, I used samtools to sort and index the BAM files downloaded from ENCODE. The commands are as follows:
samtools sort input.bam -o output.bam
samtools index output.bam
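One way to confirm that the sort took effect is to inspect the BAM header. The sketch below defines a small helper, `is_coordinate_sorted` (a hypothetical name), that reads header text on stdin and succeeds only if the `@HD` line declares `SO:coordinate`. It assumes the header actually carries an `@HD` line, which `samtools sort` normally writes.

```shell
# Reads a SAM/BAM header on stdin and succeeds only if the @HD line
# declares coordinate sort order (SO:coordinate).
is_coordinate_sorted() {
  grep -q '^@HD.*SO:coordinate'
}

# Example use (requires samtools on PATH):
#   samtools view -H output.bam | is_coordinate_sorted && echo "coordinate-sorted"
```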
For some reason, in the enhancer prediction results I got from running the ABC model on these datasets, all "promoter" candidate elements have an ABC score of 1. I've also tried running with only DNase-seq, or without the average Hi-C data (using powerlaw instead), but I'm still getting similar results.
Is my ABC run incorrect, or did I miss something? I am running inside abc-env. Thank you!