dauparas / ProteinMPNN

Code for the ProteinMPNN paper
MIT License

Is there a way to set a maximum number of mutations per designed sequence? #64

Closed noah-c-noah-do closed 11 months ago

noah-c-noah-do commented 11 months ago

Basically, is there a way to limit the total differences from the input sequence? Could one limit the total mutations to <5 per 100 residues of chain length or something similar?

dauparas commented 11 months ago

You could add a bias toward the input sequence. The stronger the bias, the fewer mutations will be made. https://github.com/dauparas/ProteinMPNN/blob/main/helper_scripts/make_bias_per_res_dict.py
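As a rough sketch of what "bias toward the input sequence" could look like, the snippet below builds a per-residue bias array that puts a fixed weight on the native amino acid at every position. The function name `native_bias`, the `strength` value, and the example sequence are hypothetical, and the 21-letter alphabet ordering is assumed to match the one used in the linked helper script; this is not the script itself.

```python
import numpy as np

# Assumption: alphabet ordering used by the ProteinMPNN helper scripts (G is index 5).
MPNN_ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def native_bias(sequence, strength=2.0):
    """Return a (len(sequence), 21) array with `strength` on each native residue.

    `native_bias` and `strength` are illustrative names, not part of ProteinMPNN.
    """
    aa_to_idx = {aa: i for i, aa in enumerate(MPNN_ALPHABET)}
    bias = np.zeros((len(sequence), len(MPNN_ALPHABET)))
    for pos, aa in enumerate(sequence):
        # Letters outside the alphabet fall back to the unknown token 'X'.
        bias[pos, aa_to_idx.get(aa, aa_to_idx["X"])] = strength
    return bias


# Hypothetical 4-residue chain; a real chain sequence would go here.
bias = native_bias("GAGL", strength=2.0)
print(bias.shape)
```

The `strength` value is the knob: a small value only nudges designs toward the wild type, while a large one leaves little room for mutations. It does not enforce a hard cap on the mutation count, so the number of differences would still need to be checked on the sampled sequences.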

noah-c-noah-do commented 11 months ago

Hi Justas,

Thanks for the quick reply! If I'm grokking that code correctly, then you're suggesting I just edit the code:

    if chain == 'A':
        residues = [A, B,...Z]
        amino_acids = [5]
        for res in residues:
            for aa in amino_acids:
                bias_per_residue[res, aa] = 100

So that A-Z are the indices of each glycine residue in the WT sequence, and then add a similar sub-block for each of the other amino acid types? Wouldn't that result in just the native sequence being returned? Or would I avoid that by specifying the positions to design with the --specify_non_fixed flag?
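To get a feel for how the size of the bias matters, here is a small numerical sketch. It assumes the bias is simply added to the per-position logits before a softmax, which may not match the exact sampling code, and the logits themselves are made up for illustration (`native_probability` is a hypothetical helper).

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def native_probability(logits, native_idx, bias):
    """Probability of the native residue after adding `bias` to its logit."""
    biased = np.array(logits, dtype=float)
    biased[native_idx] += bias
    return softmax(biased)[native_idx]


# Made-up logits over 21 tokens: the model prefers index 9 (L) over native G (index 5).
logits = np.zeros(21)
logits[9] = 3.0

for b in (0.0, 2.0, 100.0):
    print(f"bias={b:6.1f}  P(native)={native_probability(logits, 5, b):.4f}")
```

Under these assumptions a bias of 100 drives the native probability to essentially 1, so every biased position would come back as wild type, while a bias of a few units leaves room for some mutations.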