YeoLab / merge_peaks

Pipeline for using IDR to produce a set of peaks given two replicate eCLIP peaks

Question on input normalization #16

Closed eric-d-larson closed 2 years ago

eric-d-larson commented 2 years ago

I'm working through the analysis of an eCLIP experiment using your pipelines. Everything completed, including the merge peaks pipeline. However, a graduate student in the lab pointed out something regarding input normalization. For certain reproducible peaks, we are seeing read coverage in the IP samples but not in the input samples. When I look through the output files (the full files after input normalization), the input shows a read count of '1' even though there is no coverage in the BedGraph file for the input. I've looked through some of the Perl scripts to figure out what's happening, but I'm not fluent enough in Perl to get a grasp on it. My thought is that if there are reads in the IP samples but not in the input, the input is assigned a read count of 1. Perhaps if we sequenced the input deeper, we'd see some small coverage.
One caveat to our experiment is that we had a fairly low sequencing depth overall. Attached is an example of our files in IGV.

image

byee4 commented 2 years ago

Hi Eric, yes that appears correct. If there are no reads in the input then a pseudocount is added in order to compute the enrichment by fold change.
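
To illustrate why the pseudocount is needed, here is a minimal Python sketch, not the pipeline's actual Perl code. The function name, the parameter names, and the normalization by total read counts are assumptions made for illustration; the only behavior taken from this thread is that an input count of 0 is replaced by 1. Without that replacement, the fold change would require dividing by zero.

```python
import math

def log2_fold_enrichment(ip_reads, input_reads, ip_total, input_total, pseudocount=1):
    """Hypothetical helper: log2 fold enrichment of IP over input for one peak.

    Assumes ip_reads > 0 (a called peak has IP coverage) and that counts are
    scaled by each library's total read count (an assumption, not confirmed
    by the pipeline source).
    """
    # If the input has no reads in the peak, substitute the pseudocount
    # so the ratio below is defined.
    if input_reads == 0:
        input_reads = pseudocount
    # Equivalent to (ip_reads / ip_total) / (input_reads / input_total).
    ratio = (ip_reads * input_total) / (input_reads * ip_total)
    return math.log2(ratio)

# A peak with 8 IP reads and no input reads, in equally sized libraries:
# the input is treated as 1 read, giving an 8-fold (log2 = 3) enrichment.
example = log2_fold_enrichment(8, 0, 1_000_000, 1_000_000)
```

With low sequencing depth, a peak like this gets the same value whether the input truly had 0 or 1 reads, so enrichment at such peaks is best read as an estimate limited by input depth.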

eric-d-larson commented 2 years ago

works for me. thanks for the fast response!