dauparas / ProteinMPNN

Code for the ProteinMPNN paper
MIT License

Fixing cases where mutations are introduced although they do not pass the pssm_threshold #56

Open LiorZ opened 1 year ago

LiorZ commented 1 year ago

The following line: probs_masked+=probs*0.001

May introduce mutations that are below the pssm threshold.

For example, suppose probs[i,j] ≈ 1 and probs[i,k] = 0 for all k != j, but pssm_log_odds_mask[i,j] = 0. The forbidden amino acid j can still be introduced, because after this line probs_masked[i,j] ≈ 0.001 and probs_masked[i,k] = 0 for all k != j.

The normalization step then runs: probs = probs_masked/torch.sum(probs_masked, dim=-1, keepdim=True) #[B, 21]

Now probs[i,j] = 1, although residue j does not pass the PSSM threshold.
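To make the arithmetic concrete, here is a minimal sketch of a single position using plain Python lists instead of tensors. It assumes probs_masked is first computed as probs multiplied elementwise by the mask (the exact preceding line is not quoted in this issue), and the helper name renormalize and its leak parameter are hypothetical, used only for illustration.

```python
def renormalize(probs, mask, leak=0.001):
    """Mask the probabilities, add back leak * probs, then renormalize.

    Mirrors, for one position, the steps described in this issue:
      probs_masked = probs * mask          (assumed preceding step)
      probs_masked += probs * 0.001
      probs = probs_masked / sum(probs_masked)
    """
    masked = [p * m for p, m in zip(probs, mask)]
    masked = [pm + p * leak for pm, p in zip(masked, probs)]
    total = sum(masked)
    return [pm / total for pm in masked]


# Extreme case from the issue: all probability mass sits on index 1,
# which the mask forbids (mask value 0).
mask = [1.0, 0.0, 1.0]
extreme = renormalize([0.0, 1.0, 0.0], mask)

# Milder case: the forbidden residue keeps a small but nonzero share.
mild = renormalize([0.02, 0.97, 0.01], mask)

print(extreme)  # the forbidden index ends up with probability 1.0
print(mild)     # the forbidden index ends up with roughly 3%
```

Note that with leak set to 0 the extreme case would divide by zero (in PyTorch, 0/0 yields NaN), which may be the reason the extra term exists in the first place; that is a guess, not something stated in the code.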

Is that a bug or a feature? :-D In other words, if pssm_log_odds_mask[i,j] = 0, shouldn't probs_masked[i,j] be 0 as well?
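If the intended behavior is that a masked-out residue always ends up with probability 0, one possible variant is sketched below for a single position, again with plain Python lists. This is only an illustration under my own assumptions, not the repository's code: the function name renormalize_strict and the eps parameter are hypothetical, and the small constant is added to the allowed residues only, so it cannot reintroduce a forbidden one.

```python
def renormalize_strict(probs, mask, eps=0.001):
    """Mask, add a small weight to ALLOWED residues only, then renormalize.

    Hypothetical alternative to `probs_masked += probs * 0.001`:
    the eps term is multiplied by the mask, so forbidden entries stay 0.
    """
    masked = [p * m for p, m in zip(probs, mask)]
    masked = [pm + eps * m for pm, m in zip(masked, mask)]
    total = sum(masked)
    if total == 0.0:
        # Every residue is forbidden at this position; nothing sensible
        # to sample from, so fail loudly instead of returning NaN.
        raise ValueError("mask forbids every residue at this position")
    return [pm / total for pm in masked]


# Same extreme case as in the issue: all mass on the forbidden index 1.
strict = renormalize_strict([0.0, 1.0, 0.0], [1.0, 0.0, 1.0])
print(strict)  # forbidden index stays at 0; mass is split over allowed ones
```

The trade-off is that this slightly flattens the distribution over the allowed residues, so whether it is acceptable depends on what the original 0.001 term was meant to achieve.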