Closed. Ferossitiziano closed this issue 3 years ago.
Hi Federico, No, HiC-Pro does not perform any normalisation by sequencing depth. This is something you should do yourself during downstream analysis. Best
Dear @nservant
Does ICE normalization take the sequencing depth into account? Thank you very much.
Best wishes, Zheng zhuqing
Hi, The goal of the ICE method is to iteratively normalize the data so that the sum of each genomic bin is equal to a constant. In the original ICE paper (Imakaev et al. 2012), this constant is set to 1. In the iced python package, if I remember correctly, the constant is set to the mean signal. This is something you can easily check by loading your contact maps in R or python and computing the row sums (genome-wide). Thus, to me, there is no normalization by sequencing depth per se, and I think this is something you should do yourself before downstream analysis. best
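To check this on your own data, you can compute genome-wide row sums from a sparse upper-triangular contact map (the triplet format HiC-Pro writes in its .matrix files). A minimal sketch in Python; the triplet arrays below are a toy, ICE-like example where every row sums to the same constant, not real data:

```python
import numpy as np

def row_sums_from_triplets(bins1, bins2, counts, n_bins):
    """Genome-wide row sums of a symmetric contact map stored in
    upper-triangular triplet form (bin_i, bin_j, count)."""
    sums = np.zeros(n_bins)
    for i, j, c in zip(bins1, bins2, counts):
        sums[i] += c
        if i != j:  # off-diagonal entries contribute to both bins
            sums[j] += c
    return sums

# Toy 3-bin matrix whose row sums are all equal, as ICE aims for.
b1 = np.array([0, 0, 0, 1, 2])
b2 = np.array([0, 1, 2, 2, 2])
c  = np.array([1.0, 3.0, 2.0, 3.0, 1.0])
s = row_sums_from_triplets(b1, b2, c, n_bins=3)
print(s)  # each bin's total interaction signal
```

If the row sums of your ICE-normalized map are (approximately) constant genome-wide, the normalization behaved as described above; the value of that constant tells you whether the implementation targets 1 or the mean signal.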
Hi Nicolas, To follow up on these points: if the goal of an experiment is to compare changes across two or more samples processed with HiC-Pro, are the ICE-normalized matrices enough to plot and visually compare, or do you suggest further processing the matrices before inspecting differences across samples? Is read depth a critical factor to account for after ICE normalization when comparing samples? If so, would you recommend downsampling reads (so each sample has the same number of raw reads) before running HiC-Pro, or is it better to normalize the samples by mapped reads after running HiC-Pro?
Hi, The ICE-normalized data does not take sequencing depth into account. Actually, if I'm not wrong (but you could double-check this), the sum of interactions in the ICE matrix should be the same as the sum of interactions in the raw matrix. Thus, if you want to compare two contact maps, I would suggest adding an additional normalization step on the total number of reads, for instance by simply scaling the counts to 1 million (counts per million). Best Nicolas
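The depth normalization suggested above can be sketched as a simple counts-per-million scaling: divide every entry by the matrix total and multiply by one million, so both maps sum to the same value regardless of library size. The WT/KO arrays below are hypothetical counts, not real data:

```python
import numpy as np

def scale_to_cpm(counts):
    """Scale interaction counts so the whole matrix sums to 1 million,
    making maps from libraries of different depth comparable."""
    return counts * 1e6 / counts.sum()

wt = np.array([30.0, 60.0, 10.0])  # hypothetical WT counts (deeper library)
ko = np.array([10.0, 20.0, 5.0])   # hypothetical KO counts (~3x fewer reads)

wt_cpm = scale_to_cpm(wt)
ko_cpm = scale_to_cpm(ko)
```

After this step, differences between `wt_cpm` and `ko_cpm` reflect changes in contact frequency rather than differences in sequencing depth.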
Hi!
I was wondering if there is a point in the HiC-Pro pipeline where the number of reported interactions is normalised by the total number of input reads for each sample.
In my WT sample the number of reads is three times higher than in my KO sample. I was wondering if the pipeline takes care of this, or if I have to apply some normalisation at the end of the process in order to compare WT and KO.
Thank you,
Federico