open2c / coolpuppy

A versatile tool to perform pile-up analysis on Hi-C data in .cool format.
MIT License

cooltools pileup for CTCF and RAD21 #125

Closed odovgusha closed 1 year ago

odovgusha commented 1 year ago

State the question

Hi, I am working with human Hi-C data and want to aggregate the signal between PRC2, CTCF, or RAD21 peaks. I only managed to make this plot for PRC2, because I found a tip in the HiCExplorer tutorial and used only interactions longer than the average human TAD size (~1,600,000 bp). However, I struggle to get any signal for CTCF or RAD21 peaks (only background). I tried different datasets for the same cell type (both Hi-C and ChIP-seq), and it did not help.

Could you suggest whether any particular filtering steps should be applied to CTCF or RAD21 peaks, or any settings I should use in the tool itself? For example, using only ChIP-seq peaks at TAD boundaries, or only interactions between 200,000 and 1,000,000 bp (as in the coolpuppy tutorial).

I did not find a tutorial on this, and I am new to Hi-C data, so working with it is not intuitive for me. Could you suggest a source where I can find tips on these plots, or give a suggestion yourself? Which parameters should I think about first when trying to aggregate interactions between ChIP-seq peaks?

Thanks!

Phlya commented 1 year ago

Choosing a reasonable distance should help, but also annotating and using CTCF motif orientations helps get a much clearer picture. See the tutorial for that.
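To make the two suggestions above concrete, here is a minimal sketch in plain Python of what "choosing a distance" and "using orientation" mean for pairs of sites. It is not coolpuppy's implementation; the function name, the toy coordinates, and the 200,000 to 1,000,000 bp window (borrowed from the question) are all illustrative assumptions.

```python
from itertools import combinations

def pairs_in_distance_window(sites, mindist, maxdist):
    """Yield (chrom, mid1, mid2, orientation) for same-chromosome site pairs
    whose midpoints are between mindist and maxdist bp apart (inclusive).

    `sites` is an iterable of (chrom, start, end, strand) tuples.
    `orientation` is the upstream strand followed by the downstream strand,
    e.g. "+-" for a convergent pair.
    """
    by_chrom = {}
    for chrom, start, end, strand in sites:
        mid = (start + end) // 2
        by_chrom.setdefault(chrom, []).append((mid, strand))
    for chrom, items in by_chrom.items():
        items.sort()  # upstream site first
        for (mid1, s1), (mid2, s2) in combinations(items, 2):
            if mindist <= mid2 - mid1 <= maxdist:
                yield chrom, mid1, mid2, s1 + s2

# Toy, made-up motif sites (not real CTCF positions)
sites = [
    ("chr1", 1_000_000, 1_000_020, "+"),
    ("chr1", 1_300_000, 1_300_020, "-"),
    ("chr1", 5_000_000, 5_000_020, "-"),
    ("chr2", 2_000_000, 2_000_020, "+"),
]

pairs = list(pairs_in_distance_window(sites, 200_000, 1_000_000))
# Convergent pairs: forward motif upstream, reverse motif downstream
convergent = [p for p in pairs if p[3] == "+-"]
```

In this toy input only the first two chr1 sites fall inside the window, and they happen to be convergent. In practice the pile-up tool handles pair selection itself; the sketch only shows which pairs such a distance and orientation filter would keep.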

odovgusha commented 1 year ago

Thanks! Could you please share the motif and the settings for the gimme motifs scan? I am struggling to replicate the results in your CTCF.bed file, so I cannot be sure that I will get the proper orientation for my CTCF peaks. My script:

gimme scan test_CTCF.bed/CTCF_hits-in-peaks_ENCFF498QCT.bed -p MA0139.1.pfm -g test_CTCF.bed/hg38.fa -b --nreport 1 --cutoff 0 > CTCF_motifs3.bed

Phlya commented 1 year ago

I think @agalitsyna created the CTCF file used here and in cooltools docs, but I am not sure. Perhaps she can comment?

Phlya commented 1 year ago

What I can suggest is what I have done in the past: after annotating motifs, take only peaks in which all motifs are in the same orientation. CTCF very often has two divergent motifs within one ChIP-seq peak, and that would throw off the analysis.
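The filtering step described above could be sketched as follows. This is an illustrative plain-Python version under stated assumptions, not the code used for the tutorial's CTCF file: peaks are assumed to be (chrom, start, end) tuples, motif hits are assumed to be (chrom, start, end, strand) tuples already parsed from a motif scan, and the function name and toy coordinates are made up.

```python
def peaks_with_consistent_orientation(peaks, motif_hits):
    """Keep only peaks whose overlapping motif hits all share one strand.

    Returns (chrom, start, end, strand) tuples. Peaks without any motif hit,
    or with hits on both strands, are dropped.
    """
    oriented = []
    for chrom, start, end in peaks:
        # Strands of all motif hits overlapping this peak (half-open intervals)
        strands = {
            m_strand
            for m_chrom, m_start, m_end, m_strand in motif_hits
            if m_chrom == chrom and m_start < end and m_end > start
        }
        if len(strands) == 1:
            oriented.append((chrom, start, end, strands.pop()))
    return oriented

# Toy, made-up data
peaks = [("chr1", 100, 400), ("chr1", 1000, 1300), ("chr1", 2000, 2300)]
motif_hits = [
    ("chr1", 150, 169, "+"),    # single motif in the first peak
    ("chr1", 1050, 1069, "+"),  # second peak has motifs on both strands
    ("chr1", 1200, 1219, "-"),
]

oriented_peaks = peaks_with_consistent_orientation(peaks, motif_hits)
```

Here only the first peak survives: the second has divergent motifs and the third has none. The nested loop is quadratic and only meant for illustration; for genome-wide data an interval-overlap tool would be the more practical choice.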

odovgusha commented 1 year ago

Thank you for the swift reply! Also, the tutorial clearly shows that the orientation of CTCF has a huge impact on the analysis. I guess I just need to properly mark the CTCF orientation in each peak, and then I will be ready to apply it to my data.

Phlya commented 1 year ago

Assume this is resolved, feel free to reopen.