Open vitkl opened 1 year ago
Hello, I have a related question: are the affinity scores defined such that higher values mean higher affinity?
Another related question: what causes the learned TF-DNA preference weights to be negative, with a maximum value of 0 per position (for the most important nucleotide)? I don't fully understand why this constraint follows from the equations or how it's implemented in the code.
Hi
Congratulations on this impressive and hugely useful work, both the ProBound model and the MotifCentral database!
I am trying to understand how exactly the relative affinities for new sequences are computed using curated MotifCentral models, i.e. what `bindingModeScores` computes in this line:
I struggle to understand which equation is used to compute relative affinity as a function of A) the PSAM (presumably stored in MotifCentral.v1.0.0.json) `w_{motif length, 4 nucleotides}`, and B) the new one-hot encoded `sequences_{total length, 4 nucleotides}`. Specifically, what is the function/equation that's used to compute one relative affinity for one offset?
I see that this computation is done in `slidePN` and that it is related to Eq 5 in the paper methods section:
However, I don't understand how the 2 terms below are related to the PSAM and the new sequence: is beta_a = PSAM and X(S) = the new sequence?
Could you please explain this in a bit more detail, ideally writing pseudocode for `affinity = function(w_{motif length, 4 nucleotides}, s_{motif length, 4 nucleotides})`?
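To make the request concrete, here is a minimal sketch of the kind of function being asked about, written under the generic log-additive PSAM convention. It is not taken from the ProBound code and is not confirmed to match what `slidePN` or `bindingModeScores` do. Assumptions: the weights `w` are on a log scale with one row per motif position and columns ordered A, C, G, T; only the forward strand is scored; the names `one_hot`, `sliding_affinities`, and `w_example` are hypothetical and used only for illustration.

```python
import math

NUCLEOTIDES = "ACGT"  # assumed column order of the weight matrix


def one_hot(seq):
    """Encode a DNA string as a list of length-4 rows (A, C, G, T)."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]


def affinity(w, s):
    """Relative affinity of a single window (one offset).

    w: motif_length x 4 log-scale weights (assumed layout).
    s: motif_length x 4 one-hot encoding of the window.
    Returns exp(sum over positions and nucleotides of w * s).
    """
    if len(w) != len(s):
        raise ValueError("window length must equal motif length")
    log_score = sum(
        w_pos[k] * s_pos[k] for w_pos, s_pos in zip(w, s) for k in range(4)
    )
    return math.exp(log_score)


def sliding_affinities(w, seq):
    """Score every forward-strand offset of seq with the motif w."""
    x = one_hot(seq)
    m = len(w)
    return [affinity(w, x[i:i + m]) for i in range(len(x) - m + 1)]


# Hypothetical 2-position motif preferring "AC"; the best nucleotide at each
# position has weight 0, all others are negative.
w_example = [
    [0.0, -2.0, -3.0, -1.0],
    [-1.5, 0.0, -2.5, -3.0],
]

per_offset = sliding_affinities(w_example, "GACT")  # windows: GA, AC, CT
total = sum(per_offset)  # one possible way to combine offsets (an assumption)
```

In this sketch the window "AC" scores exp(0) = 1 and every other window scores below 1, so with weights capped at 0 per position the optimal sequence gets a relative affinity of exactly 1 and higher values mean stronger predicted binding. Whether ProBound combines offsets by summation, and how it treats the reverse strand, is exactly what I would like to have confirmed.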