Closed JiaweiZhang1997 closed 1 year ago
Hi, thanks for your interest in our work. We briefly introduced how we used TEIM-Seq to generate new motifs in the Applications of TEIM-Seq
subsection of the Methods
section:
We then generated new motifs for the epitope. In particular, we first randomly started from an initial CDR3β sequence from the background PWM and then used simulated annealing to improve the binding scores. More specifically, in each iteration, we randomly mutated a position of a CDR3β sequence and predicted its binding score with the epitope. If the predicted binding score was higher after mutation, we accepted the sequence otherwise we accepted it with a certain probability. After 10,000 iterations, all CDR3s with binding scores greater than 0.9 were used to calculate new motifs.
We now provided a pseudocode for the generation algorithm:
function generateNewMotifs(backgroundPWM, epitope)
motifs <- empty list
// initial sequences
randomCDR3 <- randomly select a CDR3β sequence from backgroundPWM
currentScore <- calculateBindingScore(randomCDR3, epitope)
repeat t = 1, 2, ..., 10 000
mutatedCDR3 <- mutate(randomCDR3) // Mutate a position in the sequence
mutatedScore <- calculateBindingScore(mutatedCDR3, epitope)
if mutatedScore > currentScore // Accept if mutatant has a higher affinity
randomCDR3 <- mutatedCDR3
currentScore <- mutatedScore
else // Randomly accept
acceptanceProbability <- calculateAcceptanceProbability(mutatedScore, currentScore, t)
if acceptanceProbability > random(0, 1)
randomCDR3 <- mutatedCDR3
currentScore <- mutatedScore
end if
end if
if currentScore > 0.9 // Regard as a binding sequence
motifs.append(randomCDR3)
end if
end repeat
motifPWM <- averageAllSequences(motifs)
return motifPWM
end function
As you can see, we chose 0.9 as a threshold for binding.
Thank you for your reply! And it's an interesting method! I have one more question, after getting the motifPWM, we still need an algorithm like gliph to calculate to get the motif, is this understanding correct?
You can utilize motifPWM directly as the binding motifs. In this context, the term "motif" refers to the binding pattern found in the entire sequences, similar to the motifs described in VDJdb (https://vdjdb.cdr3.net/motif). The motifs defined in GLIPH, on the other hand, primarily refer to the binding k-mer sequences. Basically, the algorithm mentioned above focuses on identifying a set of binding sequences. Once you have obtained this set, you can calculate the PWM as motifs directly or use GLIPH to detect motif k-mers, depending on your specific application requirements.
Thank you very much! ^_^ Now I have a better understanding of this method.
Hi, I am curious about how to generate new motifs by TEIM-Seq but I haven't gotten the answer from the paper. It's there a threshold that split the motifs and background amino acids? If so, how to get the threshold?
I am looking forward to your reply!