kundajelab / DMSO


Analysis of TAD enrichment for differential ATAC peaks and genes #9

Open annashcherbina opened 7 years ago

annashcherbina commented 7 years ago

Analysis steps:

  1. Counted the number of ATAC-seq peaks in each TAD region from the hES dataset (http://chromosome.sdsc.edu/mouse/hi-c/download.html). The IDR peak set, merged across conditions, contained 129,441 peaks in total.

  2. Counted number of expressed genes (TPM >=1) in each TAD region.

  3. Counted number of differential ATAC peaks and differential genes in each TAD region. The total number of differential ATAC peaks in the data was 660. The total number of differential genes was 413.

  4. Performed a permutation test to determine whether any TADs were statistically enriched for differential peaks or genes. I randomly selected 660 peaks from the full set of 129,441 and assigned them to TAD regions, repeating this 1000 times to generate the expected distribution of differential peak counts in each TAD region. The same was done for genes: 413 genes were randomly selected from the set of all expressed genes, repeated 1000 times, to obtain the expected distribution of differential gene counts in each TAD.

  5. Computed z-scores to determine whether the difference between the observed and expected number of differential peaks/genes in each TAD region is significant.
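The steps above can be sketched roughly as follows. This is a minimal illustration, not the code used for the analysis. All function and variable names are hypothetical. It assumes TADs on a chromosome are sorted, non-overlapping, and have inclusive end coordinates, that each feature is assigned to a TAD by a single position (e.g. a peak midpoint), and that the z-score uses the sample standard deviation of the permuted counts. None of these details are stated in this issue.

```python
import numpy as np


def assign_to_tads(positions, starts, ends):
    """Map positions on one chromosome to TAD indices (-1 = outside every TAD).

    Assumes `starts` is sorted, TADs do not overlap, and `ends` are inclusive.
    """
    idx = np.searchsorted(starts, positions, side="right") - 1
    inside = (idx >= 0) & (positions <= ends[idx])
    return np.where(inside, idx, -1)


def count_per_tad(tad_ids, n_tads):
    """Count features per TAD, ignoring features outside any TAD (-1)."""
    return np.bincount(tad_ids[tad_ids >= 0], minlength=n_tads)


def permutation_zscores(feature_tads, is_differential, n_tads, n_perm=1000, seed=0):
    """Compare observed differential counts per TAD with a permutation null.

    feature_tads    : TAD index of every feature (all peaks or all expressed genes)
    is_differential : boolean mask marking the differential features
    """
    rng = np.random.default_rng(seed)
    n_diff = int(is_differential.sum())
    observed = count_per_tad(feature_tads[is_differential], n_tads)

    null = np.empty((n_perm, n_tads), dtype=float)
    for i in range(n_perm):
        # Draw as many features as there are differential ones, without replacement.
        picked = rng.choice(feature_tads.size, size=n_diff, replace=False)
        null[i] = count_per_tad(feature_tads[picked], n_tads)

    expected = null.mean(axis=0)
    spread = null.std(axis=0, ddof=1)
    # The z-score is undefined for TADs whose permuted count never varies.
    with np.errstate(divide="ignore", invalid="ignore"):
        z = np.where(spread > 0, (observed - expected) / spread, np.nan)
    return observed, expected, spread, z


# Toy usage with made-up data: 5000 features spread over 50 TADs, 100 "differential".
demo_rng = np.random.default_rng(1)
demo_tads = demo_rng.integers(0, 50, size=5000)
demo_mask = np.zeros(5000, dtype=bool)
demo_mask[demo_rng.choice(5000, size=100, replace=False)] = True
obs, exp, sd, z = permutation_zscores(demo_tads, demo_mask, n_tads=50)
```

With only a few hundred differential features spread over many TADs, the expected count per TAD is small and the null distribution is discrete and skewed, so z-scores from a sketch like this are best treated as a ranking aid rather than exact significance levels.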

Results:

Attachment: tad_enrichment.xlsx

[Figure: significant_tad_enrichment_check]

The boxplots below include TAD regions with >1 observed differential peak.

[Figure: atac_tad_enrichment_check]

The boxplots below include TAD regions with >1 observed differential gene.

[Figure: gene_tad_enrichment_check]

In summary, we are not seeing any TAD regions significantly enriched for differential peaks. We are, however, seeing some TAD regions enriched for differential genes. I have the most confidence in these 5:

chr | start | end
--- | --- | ---
chr2 | 30310000 | 30339999
chr12 | 16500000 | 16534999
chr4 | 23885000 | 23894999
chr9 | 17480000 | 17519999
chr14 | 13130000 | 13299999

Interestingly, there appear to be 4 TAD regions that are significantly under-represented for differential genes. I am not certain what to make of these:

chr | start | end
--- | --- | ---
chr22 | 6165000 | 6189999
chr16 | 11060000 | 11074999
chr19 | 7960000 | 7969999
chr3 | 24895000 | 24919999