loosolab / TOBIAS

Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal
MIT License

More than one TF per volcano dot #112

Closed by RadPa 2 years ago

RadPa commented 2 years ago

Hi, I have a few questions regarding the binding scores and the volcano plot created by TOBIAS.

  1. Can more than one TF have identical binding scores? I see this in my results.txt.
  2. I understand that each dot in the volcano plot is a TF motif, but when I plot from results.txt (after the percentile cutoff), some dots contain more than one TF (2 to 10).
  3. The plot created by BINDetect has only one TF per dot. Here is a snippet from results.txt:
    B-H1_MA0168.1   B-H1   MA0168.1  C_unc-4  59759  0.12574  3399  0.12386  3135   0.05502  3.29476E-111
    B-H2_MA0169.1   B-H2   MA0169.1  C_unc-4  59759  0.12574  3399  0.12386  3135   0.05502  3.29476E-111
    C15_MA0170.1    C15    MA0170.1  C_unc-4  59759  0.12574  3399  0.12386  3135   0.05502  3.29476E-111
    lms_MA0175.1    lms    MA0175.1  C_unc-4  59759  0.12574  3399  0.12386  3135   0.05502  3.29476E-111
    NK7.1_MA0196.1  NK7.1  MA0196.1  C_unc-4  59759  0.12574  3399  0.12386  3135   0.05502  3.29476E-111
    abd-A_MA0206.1  abd-A  MA0206.1  C_lbl    67779  0.12115  3287  0.11933  2961   0.05288  2.58402E-118
    achi_MA0207.1   achi   MA0207.1  C_vis    64161  0.15151  8429  0.15211  8427  -0.00909  6.42775E-41
    al_MA0208.1     al     MA0208.1  C_lbl    67779  0.12115  3287  0.11933  2961   0.05288  2.58402E-118

    I am not sure if I am doing something wrong.

Thank you Radhika

msbentsen commented 2 years ago

Hi Radhika,

If two or more motifs are similar enough that they find the same binding sites, their scores will be identical in the results. Is this the case here? I see that the motifs have different IDs, but if the sites in output/<TF>/beds/<TF>_all.bed are identical, that explains it.

If that is not the case, I would need a little more information about your run (e.g. TOBIAS version, type of data used, commands) to find out what is going on. Thanks!
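
One way to run the check described above is to compare the site coordinates of two motifs' bed files. The following is only a minimal sketch, not part of TOBIAS: the function names are hypothetical, and it assumes the bed files are tab-separated with chromosome, start, and end in the first three columns (the remaining columns, such as the motif name, are ignored because they differ between motifs).

```python
def sites_from_lines(lines):
    """Collect (chrom, start, end) tuples from bed-formatted lines.

    Assumption: tab-separated fields with coordinates in the first
    three columns. Blank lines and '#' comment lines are skipped.
    """
    sites = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end = line.split("\t")[:3]
        sites.add((chrom, int(start), int(end)))
    return sites


def same_sites(path_a, path_b):
    """Return True if both bed files contain exactly the same coordinates."""
    with open(path_a) as file_a, open(path_b) as file_b:
        return sites_from_lines(file_a) == sites_from_lines(file_b)
```

Calling `same_sites` on the two `_all.bed` files of a pair of motifs with identical scores (paths following the output/<TF>/beds/<TF>_all.bed pattern) should return True if identical sites are the explanation.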

RadPa commented 2 years ago

Thank you for the explanation. I checked the bed files and the sites are identical, so that makes sense. I would also like to know how BINDetect ends up labeling only one TF as significant in the volcano plot. I apologize for the delayed response.

Thank you Radhika

msbentsen commented 2 years ago

Hi Radhika,

BINDetect labels the top 5% of TFs in each direction in the volcano plot, so I guess it can happen that two identical TFs lie right at the 5% border (so one in, one out). Is that what you meant? In that case, you can count all the TFs at this position as "significant".
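
The suggestion to count every TF at a tied position as significant can be sketched as below. This is a simplified illustration, not BINDetect's actual code: the function name and the `fraction` parameter are hypothetical, and it looks only at the change score of each motif. It finds the score at the 5% border in each direction and then labels every motif at or beyond that score, so motifs with identical scores are always labeled together.

```python
import math


def label_with_ties(changes, fraction=0.05):
    """Return motif names in the top `fraction` in each direction.

    `changes` maps motif name -> change score. Motifs tied with the
    score at the border are all included, so identical motifs are
    never split into "one in, one out".
    """
    values = sorted(changes.values())
    # Number of motifs per direction before ties are added (at least one).
    k = max(1, math.ceil(len(values) * fraction))
    upper_border = values[-k]     # lowest score among the k highest
    lower_border = values[k - 1]  # highest score among the k lowest
    return {
        name
        for name, change in changes.items()
        if change >= upper_border or change <= lower_border
    }
```

Because ties are included, the number of labeled motifs can exceed 5% per direction, which matches the case where several motifs share one dot at the border.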

BR Mette

RadPa commented 2 years ago

Hi Mette, I apologize for the delay. Yes, that's what I was trying to ask. Thank you Radhika