loosolab / TOBIAS

Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal
MIT License

Tuning of bound/unbound thresholds #281

Closed Fernandobec closed 1 month ago

Fernandobec commented 2 months ago

Hi! Thanks for the amazing tool. I'm working with snATAC-seq pseudobulk data and noticed that the difference in bound regions between clusters that express a given TF and clusters that don't is smaller than one would expect: the clusters without expression of the chosen TF still show a substantial number of bound sites for it. I guess this comes from the similarity between motifs of different TFs, some of which may be more ubiquitous. Does this happen in your experience, and can the thresholds be tuned to address it? Thanks! Best, Fernando

hschult commented 2 months ago

Hi Fernando,

I don't think there is a solution, as what you are describing is more a limitation of ATAC- and motif-based analysis. ATAC data only allows the prediction of binding locations, i.e. footprints (FPs), but there is no inherent information on which TF is bound at a location. To still get an idea of what might be bound, TOBIAS uses a motif database to match each FP to a motif. However, as you correctly mentioned, motifs of different TFs can be similar, so it can be challenging to know which TF is bound at a given location.

Fortunately, TOBIAS BINDetect provides two parameters that may help fine-tune your results:

  --motif-pvalue <float>       Set p-value threshold for motif scanning (default: 1e-4)
  --bound-pvalue <float>       Set p-value threshold for bound/unbound split (default: 0.001)

The first threshold controls how strictly a motif must match before it is assigned to a genomic location; the second decides whether a location is considered an FP, in other words whether something is bound there. Lowering either p-value makes the corresponding call stricter.
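As a rough sketch of how these two thresholds might be tightened, the Python snippet below only assembles a BINDetect command line and prints it; it does not run anything. All file paths, the condition names, and the chosen threshold values are placeholders rather than recommendations, and the other arguments are assumed to match your installed TOBIAS version (check `TOBIAS BINDetect --help`).

```python
import shlex


def build_bindetect_command(motifs, signals, genome, peaks, outdir,
                            cond_names, motif_pvalue=1e-4, bound_pvalue=0.001):
    """Assemble (but do not run) a TOBIAS BINDetect call as an argument list.

    The defaults for the two p-values mirror the ones quoted above.
    """
    if len(signals) != len(cond_names):
        raise ValueError("need exactly one condition name per signal file")
    return [
        "TOBIAS", "BINDetect",
        "--motifs", motifs,
        "--signals", *signals,
        "--genome", genome,
        "--peaks", peaks,
        "--outdir", outdir,
        "--cond_names", *cond_names,
        "--motif-pvalue", str(motif_pvalue),   # stricter motif matching when lowered
        "--bound-pvalue", str(bound_pvalue),   # stricter bound/unbound split when lowered
    ]


# Placeholder inputs: replace with your own files and cluster names.
cmd = build_bindetect_command(
    motifs="motifs.jaspar",
    signals=["clusterA_footprints.bw", "clusterB_footprints.bw"],
    genome="genome.fa",
    peaks="peaks.bed",
    outdir="bindetect_strict",
    cond_names=["clusterA", "clusterB"],
    motif_pvalue=1e-5,
    bound_pvalue=1e-4,
)
print(shlex.join(cmd))
```

Comparing the number of bound sites per cluster between a default run and a stricter run like this one would show whether the thresholds alone reduce the unexpected bound sites, or whether motif similarity is the main cause.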

I also want to direct your attention to the BINDetect wiki. Specifically, the "cluster" column of the bindetect_results file could interest you. Depending on the question it can be beneficial to look at groups of similar motifs rather than individual ones.
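To illustrate what looking at groups of similar motifs could involve, here is a minimal Python sketch that collects motif names per cluster. It assumes the results file is tab-separated and has `name` and `cluster` columns; check the header of your own file, since those column names are an assumption here. The inline rows are made-up placeholders standing in for a real file.

```python
import csv
import io
from collections import defaultdict


def motifs_by_cluster(handle, name_col="name", cluster_col="cluster"):
    """Group motif names by their cluster label from a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row[cluster_col]].append(row[name_col])
    return dict(groups)


# Made-up example rows; TF_A and TF_B stand for two TFs with similar motifs.
example = "name\tcluster\nTF_A\tC_1\nTF_B\tC_1\nTF_C\tC_2\n"
groups = motifs_by_cluster(io.StringIO(example))
for cluster, names in sorted(groups.items()):
    print(cluster, ", ".join(names))
```

With a real file, the same function can be called on an opened handle, e.g. `open(path, newline="")`. If a TF that is not expressed in a cluster shares its motif cluster with a TF that is expressed there, the bound sites may be better attributed to the group than to the individual TF.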

Best wishes, Hendrik

github-actions[bot] commented 1 month ago

No activity for at least 30 days. Marking issue as stale. Stale issues are closed after one week.