Footprint Score Normalisation

loosolab / TOBIAS

Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal

MIT License

Footprint Score Normalisation #222

Closed CriticalSci closed 10 months ago

CriticalSci commented 1 year ago

We are interested in comparing Footprint Scores for individual binding sites across multiple (>2) conditions.

Based on your responses to some previous issues such as #19 and #87, it appears the footprint score outputs are always normalised, specifically the values found in:

<outdir>/bindetect_results.{txt,xlsx} Column 6: <condition>_mean_score
<outdir>/<TF>/<TF>_overview.{txt,xlsx} Column 7: <condition>_score

We also assume then that <condition>_mean_score found in bindetect_results.txt is the average of the individual sites for each TF reported in <TF>_overview.txt.

Could you confirm that (at least in theory) the Footprint Scores of individual binding sites reported in <TF>_overview.txt can be compared across multiple single outputs of TOBIAS BINDetect provided the same motifs and peaks are used as input?
Could you also clarify if the bigwigs are already normalised or if this takes place during TOBIAS BINDetect?
Finally, does PlotAggregate also provide normalised outputs by default or should we be using the --normalize option?

Thanks for producing and maintaining this useful tool @msbentsen !

msbentsen commented 1 year ago

Hi, thanks for your interest!

With regard to normalization, you can actually turn this off using --norm-off in TOBIAS BINDetect, but I think what you will find is that the resulting "differential footprint scores" are skewed in one direction or the other. To go into more detail on your questions:

1 + 2. If you are running the tools in the order ATACorrect -> ScoreBigwig -> BINDetect, the initial corrected signals (<condition>_corrected.bw) are normalized to library size and reads-in-peaks (noise) ratio. Afterwards, the footprint scores (<condition>_footprints.bw) are calculated, and here we found that the efficiency of footprint calculation can differ slightly between samples (an effect beyond sequencing depth, potentially due to differences in fragment length distribution). Therefore, when these scores are used to calculate the changes between conditions, the changes tend to be slightly skewed by this technical effect (rather than reflecting a biological effect). To counteract this, we added an additional step that normalizes the signal per input before the pairwise comparison. Indeed, <condition>_mean_score is the average of all the individual sites, so normalizing the scores for individual sites makes the mean scores comparable across conditions in the run.

Long story short: The normalization of <condition>_mean_score is different depending on the other conditions used in the run, and should preferably not be compared between single runs without additional normalization.

3. If you use <condition>_corrected.bw files, and these were calculated on the same peak input, the outputs are already normalized.

Hope this clarifies the questions!
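As a rough illustration of the relationship described above (the mean score being the average of the per-site scores), the following Python sketch recomputes a mean score from a small, made-up overview table. Only the `<condition>_score` naming comes from this thread; the other column names and all values are assumptions for illustration, and real `<TF>_overview.txt` files contain more columns.

```python
import csv
import io
from statistics import fmean

# Made-up excerpt standing in for a <TF>_overview.txt file (tab-separated).
# Only the "<condition>_score" naming comes from the thread; the other
# column names and all values are hypothetical.
OVERVIEW_TXT = (
    "site_chr\tsite_start\tsite_end\tcondA_score\tcondB_score\n"
    "chr1\t100\t120\t0.50\t0.70\n"
    "chr1\t300\t320\t1.50\t1.10\n"
    "chr2\t500\t520\t1.00\t0.60\n"
)

def mean_scores(handle):
    """Average every '<condition>_score' column over all sites."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    score_cols = [c for c in rows[0] if c.endswith("_score")]
    return {
        col[: -len("_score")] + "_mean_score": fmean(float(r[col]) for r in rows)
        for col in score_cols
    }

means = mean_scores(io.StringIO(OVERVIEW_TXT))
print(means)
```

Because the mean is taken over the per-site values, any normalization applied to the individual site scores carries through to the mean.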

sufyazi commented 1 year ago

Hi Mette,

Thanks for the response!

> Long story short: The normalization of <condition>_mean_score is different depending on the other conditions used in the run, and should preferably not be compared between single runs without additional normalization.

Based on your reply, if I have say, condition A vs control A differential footprinting done with TOBIAS, and I did this differential comparison similarly for condition B, C, D, and E (each compared to control A) for the same set of motifs, and I want to see how the occupancy and TF-bound dynamics of this set of motifs changes across these conditions when compared to control A, I should not directly use the output _mean_score to do say, a clustering heatmap?
If yes, would applying something like z-score normalization suffice to do across-condition comparison (each run on TOBIAS diff footprint with a control A data)?
Finally, I recently noticed that TOBIAS BINDetect can actually take more than 2 input files and it would just run a pairwise combination on all input files; so my question is, is there any difference computationally (or rather, statistically?) if I run TOBIAS separately with condA vs controlA, condB vs controlA, condC vs controlA and comparing them post-analysis, or should I just feed TOBIAS with 'condA, condB, condC, controlA' instead, and discard pairwise combo I am not interested in (condA vs condB, condA vs condC, condB vs condC etc)?

CriticalSci commented 1 year ago

Thank you for explaining this!

So if I understand correctly the frag length normalisation step (quantile normalisation?) is only applied in BINDetect (which can be switched off with --norm-off) and impacts both the values reported in <condition>_mean_score AND <condition>_score? Or does it only impact <condition>_mean_score?

It seems then it is not possible to simultaneously compare more than 2 conditions accurately unless we drop the frag length normalisation step from BINDetect and then re-implement the frag length normalisation (quantile normalisation?) across x>2 conditions to get accurate scores. Do you think this is feasible?

Finally, it sounds like PlotAggregate doesn't apply the frag length normalisation – correct?

msbentsen commented 1 year ago

@CriticalSci :

> So if I understand correctly the frag length normalisation step (quantile normalisation?) is only applied in BINDetect

Yes, it is only applied in BINDetect, but I just want to state that it is not a frag normalization - it is an adjusted quantile normalization step which covers any leftover discrepancies between the conditions (which we just assume might be due to fragment lengths, but can also have other sources of technical variation).

> impacts both the values reported in <condition>_mean_score AND <condition>_score?

Indeed it impacts both scores, as the individual <condition>_scores are corrected, which carries through to the calculation of the mean.

> It seems then it is not possible to simultaneously compare more than 2 conditions accurately

If you have 2+ conditions, all conditions are normalized towards each other, so yes you are right that <condition1> technically has an influence on the contrast of <condition2> vs. <condition3>. If you only want the pairwise combinations, you are fine to just run BINDetect with only 2 conditions at a time - just note that the mean scores will no longer be directly comparable between runs without further normalization.

> Finally, it sounds like PlotAggregate doesn't apply the frag length normalisation – correct?

Exactly, PlotAggregate just shows the raw signal.
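To make the point about `<condition1>` influencing the `<condition2>` vs. `<condition3>` contrast concrete, here is a small Python sketch. It uses textbook quantile normalization as a stand-in; BINDetect's own "adjusted" quantile normalization is not described in this thread, so this is an assumption and not the tool's actual implementation. All scores are made up.

```python
import numpy as np

def quantile_normalize(scores):
    """Textbook quantile normalization of a sites x conditions matrix.

    Every column is mapped onto one shared reference distribution: the mean
    of the sorted columns. Ties are broken by position, not averaged.
    """
    scores = np.asarray(scores, dtype=float)
    ranks = np.argsort(np.argsort(scores, axis=0), axis=0)  # rank of each site per column
    reference = np.sort(scores, axis=0).mean(axis=1)        # shared target distribution
    return reference[ranks]

# Made-up footprint scores for three sites in three conditions.
cond1 = [10.0, 20.0, 30.0]
cond2 = [1.0, 3.0, 2.0]
cond3 = [2.0, 4.0, 6.0]

pair = quantile_normalize(np.column_stack([cond2, cond3]))
trio = quantile_normalize(np.column_stack([cond1, cond2, cond3]))

# cond3 - cond2 per site, normalized without and with cond1 in the run.
diff_pair = pair[:, 1] - pair[:, 0]
diff_trio = trio[:, 2] - trio[:, 1]
print(diff_pair)
print(diff_trio)
```

In this toy case the sign of each per-site difference is the same in both runs, but the magnitudes differ because the reference distribution changes once `cond1` is included.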

msbentsen commented 1 year ago

@sufyazi :

> Based on your reply, if I have say, condition A vs control A differential footprinting done with TOBIAS, and I did this differential comparison similarly for condition B, C, D, and E (each compared to control A) for the same set of motifs, and I want to see how the occupancy and TF-bound dynamics of this set of motifs changes across these conditions when compared to control A, I should not directly use the output _mean_score to do say, a clustering heatmap?

Exactly, you might need to apply an additional normalization in case you see one column of your heatmap with all scores higher/lower than the rest. Another way to do it is to run BINDetect with all controls and all conditions in one run, which will give you all-against-all comparisons of all the conditions. Then you can get the _mean_score and highlight controlA vs. conditionB, controlA vs. conditionC, etc.

> If yes, would applying something like z-score normalization suffice to do across-condition comparison (each run on TOBIAS diff footprint with a control A data)?

Yes, z-score or maybe quantile normalization. In the past we have also applied z-score normalization per row (so per TF) in order to see differences across conditions. Otherwise you might find your heatmap clustering by the strength of footprints (some TFs always have stronger footprints).
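The per-row (per-TF) z-score mentioned here can be sketched in Python as follows. The matrix layout (rows are TFs, columns are conditions), the TF labels and the values are assumptions for illustration.

```python
import numpy as np

def zscore_rows(mean_scores):
    """Z-score each row (TF) across its columns (conditions)."""
    scores = np.asarray(mean_scores, dtype=float)
    mu = scores.mean(axis=1, keepdims=True)
    sd = scores.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # constant rows become all zeros instead of NaN
    return (scores - mu) / sd

# Hypothetical mean scores: rows are TFs, columns are conditions.
tf_names = ["TF_strong", "TF_weak"]
scores = [[10.0, 12.0, 14.0],
          [0.10, 0.12, 0.14]]

z = zscore_rows(scores)
for name, row in zip(tf_names, z):
    print(name, np.round(row, 3))
```

Both rows end up with the same profile after scaling, so a heatmap built from `z` would group TFs by how they change across conditions and not by overall footprint strength.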

> Finally, I recently noticed that TOBIAS BINDetect can actually take more than 2 input files and it would just run a pairwise combination on all input files; so my question is, is there any difference computationally (or rather, statistically?) if I run TOBIAS separately with condA vs controlA, condB vs controlA, condC vs controlA and comparing them post-analysis, or should I just feed TOBIAS with 'condA, condB, condC, controlA' instead, and discard pairwise combo I am not interested in (condA vs condB, condA vs condC, condB vs condC etc)?

The difference lies in the normalization of samples, so if they are all run together, they are all normalized towards each other - see also answer above. If you want to compare all of the conditions in one heatmap, I would recommend that you run all-against-all and discard any additional combos afterwards.
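Discarding the unwanted combinations after an all-against-all run could look roughly like the Python sketch below. The `<cond1>_<cond2>_change` column pattern and the example header are assumptions made for illustration; check the header of your own results table before relying on them.

```python
from itertools import combinations
from math import comb

def contrasts_with(conditions, control):
    """All unordered pairs from an all-against-all run that involve `control`."""
    return [pair for pair in combinations(conditions, 2) if control in pair]

def change_columns(columns, pairs):
    """Pick '<cond1>_<cond2>_change' columns for the wanted pairs, in either order.

    The column naming pattern is an assumption, not confirmed in the thread.
    """
    wanted = {f"{a}_{b}_change" for a, b in pairs}
    wanted |= {f"{b}_{a}_change" for a, b in pairs}
    return [c for c in columns if c in wanted]

conditions = ["condA", "condB", "condC", "controlA"]
pairs = contrasts_with(conditions, "controlA")

# Hypothetical header of a results table from a four-condition run.
header = [
    "name",
    "condA_condB_change", "condA_condC_change", "condA_controlA_change",
    "condB_condC_change", "condB_controlA_change", "condC_controlA_change",
]
kept = change_columns(header, pairs)
print(comb(len(conditions), 2), "pairs in total,", len(kept), "kept")
```

The number of pairwise contrasts grows as n(n-1)/2, so the all-against-all approach gets expensive quickly as conditions are added.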

sufyazi commented 1 year ago

Hi Mette,

Thank you for the replies. They have been very helpful for rethinking our strategy. Your suggestion to use TOBIAS for an all vs. all comparison, despite its straightforwardness, appears to be very computationally expensive for us after some test runs, so we don't think it's feasible.

Some background on what we are planning to do: we have ATAC-seq data across many conditions, around n = 200, and from this set of data alone, we have around 3 million unique peaks merged across all of the conditions, which is required for TOBIAS ATACorrect and BINDetect normalization. Our current strategy is to run single-mode TOBIAS on each condition using the merged peakset across all conditions, instead of doing an all vs. all differential run, and figure out a way to normalize the footprint scores post-TOBIAS. After perusing all the relevant threads on this matter in the Issues tab, I have another question I hope you can clarify before we decide this is the best way to go.

You said above:

> Yes, it is only applied in BINDetect, but I just want to state that it is not a frag normalization - it is an adjusted quantile normalization step which covers any leftover discrepancies between the conditions (which we just assume might be due to fragment lengths, but can also have other sources of technical variation).

So, if I am running a single-mode TOBIAS for all three subcommands (but using the merged peakset that would be used to compare across 200+ conditions later manually), and I run BINDetect on the default setting (so without --norm-off), what will the footprint scores be normalized to? Or is running a single-condition analysis on TOBIAS without --norm-off technically similar to providing the --norm-off option? Our idea is to run a single-condition analysis over the merged peakset and later do a quantile normalization manually on the raw footprint scores across the conditions. We therefore think we need to turn off BINDetect normalization, because we don't want to add a normalization step to data that is already normalized.

msbentsen commented 1 year ago

Hi @sufyazi ,

> So, if I am running a single-mode TOBIAS for all three subcommands (but using the merged peakset that would be used to compare across 200+ conditions later manually), and I run BINDetect on the default setting (so without --norm-off), what will the footprint scores be normalized to?

For single-condition runs, there is no normalization - indeed same as --norm-off. So I think the idea of running individual conditions first and then normalizing later sounds good! This way you can also see already whether there is a big sample-to-sample effect (which there usually is) and normalize accordingly.
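One simple way to check for a sample-to-sample effect before choosing a normalization is to compare summary quantiles of the per-site scores between conditions. The Python sketch below does this on made-up values; how the scores from each single-condition run are collected into one matrix (same sites in the same order for every condition) is an assumption left to the reader.

```python
import numpy as np

def condition_quartiles(score_matrix):
    """Return the 25th, 50th and 75th percentile of each column (condition)."""
    scores = np.asarray(score_matrix, dtype=float)
    return np.percentile(scores, [25, 50, 75], axis=0)

# Made-up per-site scores from two single-condition runs on the same sites.
cond1 = [1.0, 2.0, 3.0, 4.0, 5.0]
cond2 = [2.0, 4.0, 6.0, 8.0, 10.0]  # same ranking, but globally doubled

quartiles = condition_quartiles(np.column_stack([cond1, cond2]))
print(quartiles)  # rows: 25th/50th/75th percentile, columns: conditions
```

If the quartiles of one condition are consistently shifted or scaled relative to the others, as in this toy case, that points to a global effect which a normalization across conditions would need to remove.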