yezhengSTAT / CUTTag_tutorial

Tutorial Website
https://yezhengstat.github.io/CUTTag_tutorial/

Puzzling enrichment peak #5

Open Siteng999 opened 2 years ago

Siteng999 commented 2 years ago

In our experiments, some CUT&Tag datasets show an abnormal enrichment pattern. We used a centromere-specific antibody to mark centromere positions. Our existing successful ChIP-seq data and part of the CUT&Tag data show significant enrichment at centromeres. Puzzlingly, the rest of the CUT&Tag data, which should also show enrichment peaks at centromeres, shows the opposite trend: the expected peaks turn into valleys. Have you encountered this problem, and do you know how to solve it? Thank you in advance for your answer!

yezhengSTAT commented 2 years ago

Hello, sorry for the delay! I was about to reply but got sidetracked by other issues.

My first thought is that, because centromere sequences are highly repetitive, fewer reads may map uniquely there. Multi-mapping reads may end up unaligned or with a low alignment quality score and hence be filtered out. Rather than filtering out such reads, you can keep one of the multi-mapping alignments chosen at random, or go with the one the aligner (such as bowtie or bwa) chooses, to get a fair representation of centromeric regions. If this is the human genome, the hg19 build is not appropriate, the hg38 build is OK, and the chm13 build is the best because it has the complete genome sequence.
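To make the filtering effect concrete, here is a minimal sketch in plain Python. It counts mapped reads per reference sequence from SAM-formatted lines, with and without a mapping-quality cutoff. The function name `count_mapped_reads`, the toy records, the reference name `cen_repeat`, and the cutoff of 10 are all assumptions made for illustration. How MAPQ is assigned differs between aligners, so treat this as a way to check your own data rather than as the tutorial's method.

```python
from collections import Counter


def count_mapped_reads(sam_lines, min_mapq=0):
    """Count primary mapped reads per reference, keeping MAPQ >= min_mapq."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # SAM column 2: bitwise FLAG
        rname = fields[2]       # SAM column 3: reference name
        mapq = int(fields[4])   # SAM column 5: mapping quality
        if flag & 0x4:          # read is unmapped
            continue
        if flag & 0x900:        # secondary or supplementary alignment
            continue
        if mapq >= min_mapq:
            counts[rname] += 1
    return counts


# Toy records (made up for illustration): one confidently placed read on
# chr1, two low-MAPQ reads on a hypothetical repetitive reference, and
# one unmapped read.
toy_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t42\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tcen_repeat\t10\t1\t50M\t*\t0\t0\t*\t*",
    "r3\t0\tcen_repeat\t20\t0\t50M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

unfiltered = count_mapped_reads(toy_sam, min_mapq=0)
filtered = count_mapped_reads(toy_sam, min_mapq=10)  # cutoff is an assumption
print("no MAPQ filter:", dict(unfiltered))
print("MAPQ >= 10:   ", dict(filtered))
```

If the repetitive reference loses most of its reads once the cutoff is applied, as in the toy input, the MAPQ filter is a plausible cause of the missing centromeric signal.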

Hope it helps.

Thanks, Ye

Siteng999 commented 2 years ago

Thank you very much for your reply!

I followed your suggestion to modify the analysis process, but the results did not change.

Setting aside the data analysis and the antibody, could this be due to how we performed the CUT&Tag experiment?

Thanks

yezhengSTAT commented 2 years ago

Yeah, that is likely due to the experimental procedure, since part of your CUT&Tag data worked. Are there any specific differences in experimental conditions between the data that show enrichment peaks and the data that do not? How about the alignment rates? Are they comparable between the two sets of data?
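One way to compare alignment rates is sketched below in Python. It assumes the reads were aligned with Bowtie2 and that each sample's alignment summary was saved as text, since that summary ends with an "overall alignment rate" line. The helper name `parse_overall_alignment_rate`, the sample labels, and the numbers in the toy summaries are hypothetical placeholders, not real results.

```python
import re


def parse_overall_alignment_rate(summary_text):
    """Return the overall alignment rate (percent) from a Bowtie2-style summary."""
    match = re.search(r"([\d.]+)% overall alignment rate", summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


# Hypothetical summaries; replace with the text your aligner actually wrote.
summaries = {
    "peaks_sample": "10000 reads; of these:\n96.50% overall alignment rate\n",
    "valleys_sample": "10000 reads; of these:\n61.20% overall alignment rate\n",
}

rates = {name: parse_overall_alignment_rate(text) for name, text in summaries.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f}% aligned")
```

A large gap between the two groups would suggest a problem upstream of peak calling, for example in library preparation or sequencing, rather than in the downstream analysis.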

Thanks, Ye