Detail normalization method used by normalizeToMatrix?

AlicePsyche commented 8 years ago

Hi,

Sorry for the interruption.

Could you show me the normalization method used by normalizeToMatrix? I want to compare K27me3 signal between two samples, but their sequencing depths are different. Usually I would apply z-score normalization to make them comparable. Which method does normalizeToMatrix use, and is it independent of sequencing depth?

I noticed that you said

Another advantage of using a color mapping function is that if you have more than one heatmaps to make, it makes colors in heatmaps comparable

So I think it doesn't suffer from sequencing depth...

Thanks in advance.

jokergoo commented 8 years ago

It simply calculates the mean coverage (sequence depth) in each bin.
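To illustrate what "mean coverage in each bin" amounts to, here is a minimal base R sketch. It is not the package's implementation: the coverage vector, the window length and the bin width are all made-up values, and the real function works on genomic ranges rather than a plain vector.

```r
# Hypothetical per-base coverage along one 100 bp window (made-up numbers).
set.seed(1)
coverage <- rpois(100, lambda = 5)

# Split the window into 10 bins of 10 bp each and average within each bin.
bin_width <- 10
bin_id <- ceiling(seq_along(coverage) / bin_width)
bin_means <- tapply(coverage, bin_id, mean)

# One row of the resulting matrix. The values stay on the raw coverage scale,
# so nothing here corrects for differences in library size between samples.
bin_means
```

Because the values are plain averages of coverage, a deeper library would be expected to give proportionally larger values in the matrix.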

If you have more than one matrix and want to make them comparable, and the absolute sequence depth is not of interest, you can use two different color mapping functions that each map between the minimal value and, e.g., the 99th percentile of their own matrix, so that the same quantiles in both matrices have the same colors. The advantage is that the heatmap legends still show the scale of each matrix.

If sequence depth is also of interest, say the first matrix has higher depth and the second has lower depth, set the same color mapping function for both heatmaps so that the same color means the same sequence depth (note that in the first approach, the same color means the same quantile). The second heatmap will then generally be lighter than the first one.
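The two approaches above can be sketched in base R as follows. The matrices are simulated, and `make_col_fun` is a hypothetical helper written only for this example; it is a simple linear stand-in for a real color mapping function such as `circlize::colorRamp2(breaks, colors)`.

```r
set.seed(1)
# Hypothetical matrices: mat1 from a deeper library, mat2 from a shallower one.
mat1 <- matrix(rexp(200, rate = 1 / 10), nrow = 20)
mat2 <- matrix(rexp(200, rate = 1 / 4), nrow = 20)

# Approach 1: each matrix gets its own breaks, from its minimum to its
# 99th percentile, so the same color means the same quantile.
breaks1 <- c(min(mat1), quantile(mat1, 0.99, names = FALSE))
breaks2 <- c(min(mat2), quantile(mat2, 0.99, names = FALSE))

# Approach 2: one set of breaks computed from both matrices together,
# so the same color means the same sequence depth.
shared_breaks <- c(min(mat1, mat2),
                   quantile(c(mat1, mat2), 0.99, names = FALSE))

# Hypothetical helper: linear white-to-red mapping between two breaks,
# with values outside the breaks clamped to the end colors.
make_col_fun <- function(breaks, colors = c("white", "red")) {
  ramp <- colorRamp(colors)
  function(x) {
    s <- (as.vector(x) - breaks[1]) / (breaks[2] - breaks[1])
    s <- pmin(pmax(s, 0), 1)
    rgb(ramp(s), maxColorValue = 255)
  }
}

col_fun1   <- make_col_fun(breaks1)        # approach 1, first heatmap
col_fun2   <- make_col_fun(breaks2)        # approach 1, second heatmap
col_shared <- make_col_fun(shared_breaks)  # approach 2, both heatmaps
```

With the first approach, the 99th percentile of each matrix gets the darkest color even though the underlying depths differ; with the shared mapping, the shallower matrix mostly falls in the lighter part of the scale.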

How do you apply the z-score transformation? I think it may not be appropriate to apply it per row; however, applying it to the whole matrix should be fine (e.g. mat2 = (mat - mean(mat, ...))/sd(mat, ...)). And be careful with outliers that have extremely large sequence depth.
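A minimal sketch of the whole-matrix z-score, with a per-row version for contrast. The data are simulated, and capping at the 99th percentile is only one possible way to handle outliers; it is an assumption of this example, not something the package does for you.

```r
set.seed(1)
# Hypothetical signal matrix (rows = regions, columns = bins).
mat <- matrix(rexp(200, rate = 1 / 5), nrow = 20)
mat[1, 1] <- 1000  # artificial outlier with extremely large depth

# Assumption: cap values at the 99th percentile so that a single outlier
# does not dominate the mean and standard deviation.
cap <- quantile(mat, 0.99, names = FALSE)
mat_capped <- pmin(mat, cap)

# Whole-matrix z-score: one mean and one sd for all cells, so differences
# between rows are preserved.
mat2 <- (mat_capped - mean(mat_capped)) / sd(mat_capped)

# Per-row z-score, for contrast: every row ends up with mean 0 and sd 1,
# so strongly and weakly enriched regions can no longer be told apart.
mat_row <- t(scale(t(mat_capped)))
```

The per-row version removes exactly the between-region differences an enrichment heatmap is meant to show, which is presumably why it is the less suitable choice here.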

AlicePsyche commented 8 years ago

Sorry, I am still confused... What do you mean by "sequence depth is/is not of interest"? I think sequence depth should always be taken into account. For example, if I want to compare K4me3 signals in two samples (with different sequence depths) in order to show differentially enriched sites, should I use the second approach, with the same color mapping?

I usually apply z-score to the whole matrix. Now I am wondering whether it is necessary...

jokergoo commented 8 years ago

Yes, sequence depth is what should be shown on the plot. Here I mean whether you want the color scales to be the same in both heatmaps.

Do you have input for your ChIP-Seq experiment? I don't think we should directly compare sequence depths between samples; instead, we need to normalize each sample by its input.
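One common way to normalize a sample by its input, not spelled out in this thread, is a log2 ratio of depth-scaled ChIP signal over depth-scaled input signal. The sketch below assumes that approach; the matrices, the library sizes and the pseudocount are all hypothetical values chosen for illustration.

```r
set.seed(1)
# Hypothetical binned coverage matrices (rows = regions, columns = bins).
chip  <- matrix(rpois(200, lambda = 20), nrow = 20)
input <- matrix(rpois(200, lambda = 5), nrow = 20)

# Hypothetical library sizes (total mapped reads) for each track.
chip_depth  <- 4e7
input_depth <- 2e7

# Scale each track to reads per million to remove library-size differences.
chip_rpm  <- chip / (chip_depth / 1e6)
input_rpm <- input / (input_depth / 1e6)

# log2 enrichment over input; the pseudocount (an arbitrary choice) avoids
# division by zero and log2(0) in empty bins.
pseudocount <- 0.01
enrichment <- log2((chip_rpm + pseudocount) / (input_rpm + pseudocount))
```

Matrices computed this way for two samples are on a common enrichment scale, so a single shared color mapping can be used for both heatmaps.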

AlicePsyche commented 8 years ago

Sorry, it seems that I am off topic. I don't want to compare sequence depths; I want to show differentially enriched sites between two samples that have different sequence depths. Usually, the one with lower depth has higher value, right? So in this situation, which approach shows the difference better without being misleading?

I have input for my ChIP-seq experiment. I use the input to call peaks, but not in the heatmap. Is that necessary? I didn't think it was common in papers.
