loosolab / TOBIAS

Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal
MIT License

Question: Can TOBIAS be run on a select number of peaks? #72

Closed kat-in-the-hat closed 3 years ago

kat-in-the-hat commented 3 years ago

Hello,

Thank you very much for developing this tool!

I just wanted to ask, would it be appropriate to run TOBIAS only on differentially accessible peaks between two genotypes?

We had hypothesised that a certain TF would be differentially bound in these regions in one genotype compared with the other, and wanted to use TOBIAS to test this hypothesis. By running it only on the differentially accessible sites, we hoped to get a more focused analysis. We ran TOBIAS on both "all peaks" and only "differentially accessible peaks" and got similar results.

Would appreciate your opinion on this, and whether this is okay to do and whether there are limitations or concerns we should be aware of.

Thanks in advance for your help!

msbentsen commented 3 years ago

Hi,

The short answer is yes, it is possible :-) The longer answer is that there are a few things to take into account:

Hope this was helpful!

kat-in-the-hat commented 3 years ago

Thank you very much Mette!

We have just realised that our initial run was done on the previous version of TOBIAS. We saw similar results when running it on "all peaks" or on "differentially accessible peaks" (even though we had more open than closed chromatin). So in the first version, we did not see skewing.

However, we wanted to follow your advice and noticed we could only have this option in the new version of TOBIAS. We ran the analysis again on the new version, and this time got drastically different results when running it on "all peaks" versus only on "differentially accessible peaks", with the "differentially accessible peaks" now exhibiting the skew you mentioned may happen.

Could you please let us know what are the major differences between the two versions of TOBIAS that could account for this discrepancy?

Sorry for the long message and thanks again for all your help! :)

msbentsen commented 3 years ago

There were some changes to the normalization around version 0.12.0/0.12.1, as the previous versions were quite sensitive to outliers. I can't say exactly how your data is behaving with the two versions, but I would of course always recommend using the latest version ;-)

Even if it is skewed, you can still trust the left-most and right-most (meaning differentially changed in both directions) TFs, and use these for analysis. Even if the overall distribution is skewed to one side or the other, these are the most changed.
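
As a rough illustration of picking out the left-most and right-most TFs, here is a minimal Python sketch. It assumes the per-TF differential binding scores have already been loaded into a dict; the TF names, the values and the `extreme_tfs` helper are all hypothetical and are not TOBIAS output or part of its API.

```python
# Toy illustration only: TF names and change scores below are made up.
def extreme_tfs(changes, n=2):
    """Return the n lowest-scoring and n highest-scoring TFs by change score."""
    ranked = sorted(changes.items(), key=lambda kv: kv[1])  # ascending by score
    lowest = ranked[:n]            # left-most tail of the distribution
    highest = ranked[-n:][::-1]    # right-most tail, largest score first
    return lowest, highest

# Hypothetical differential binding scores, skewed towards positive values.
changes = {
    "TF_A": 0.45, "TF_B": 0.30, "TF_C": 0.28, "TF_D": 0.25,
    "TF_E": 0.22, "TF_F": 0.05, "TF_G": -0.02,
}

lowest, highest = extreme_tfs(changes, n=2)
print("left-most :", lowest)
print("right-most:", highest)
```

The ranking is relative, so the tails are picked the same way whether or not the whole distribution is shifted to one side. With a strong skew, the "left-most" TFs may still have scores close to zero.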