Is there a way to set a maximum number of mutations per designed sequence?

noah-c-noah-do commented 1 year ago

Basically, is there a way to limit the total differences from the input sequence? Could one limit the total mutations to <5 per 100 residues of chain length or something similar?

dauparas commented 1 year ago

You could add a bias toward the input sequence. The stronger the bias, the fewer mutations will be made. https://github.com/dauparas/ProteinMPNN/blob/main/helper_scripts/make_bias_per_res_dict.py
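As a rough illustration of the suggestion above, here is a minimal sketch that builds a per-residue bias table favoring the native amino acid at every position. It assumes the 21-letter alphabet order `ACDEFGHIKLMNPQRSTVWYX` and the nested name/chain/row layout that the linked helper script appears to write. The function `native_bias`, the example name, the sequence, and the strength of 2.0 are all hypothetical, so check them against the helper script before use.

```python
import json

# Assumed amino-acid order (X = unknown); verify against the helper script.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def native_bias(sequence, strength):
    """Return one row of 21 floats per residue, with `strength` placed
    at the column of the native amino acid and 0.0 everywhere else."""
    rows = []
    for aa in sequence:
        row = [0.0] * len(ALPHABET)
        idx = ALPHABET.find(aa)
        if idx >= 0:  # leave the row at zero for unrecognized letters
            row[idx] = strength
        rows.append(row)
    return rows


# Hypothetical input: one structure named "example" with a single chain A.
name = "example"
chains = {"A": "MGLSDG"}

# Assumed layout: {structure name: {chain id: [[21 floats] per residue]}}.
bias_dict = {name: {chain: native_bias(seq, 2.0) for chain, seq in chains.items()}}

# One JSON object per line, as in a .jsonl file.
jsonl_line = json.dumps(bias_dict)
```

The strength is the knob to tune: a small value only nudges the design toward the input sequence, while a very large one should make the native residue dominate at every biased position.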

noah-c-noah-do commented 1 year ago

Hi Justas,

Thanks for the quick reply! If I'm grokking that code correctly, then you're suggesting I just edit the code:

    if chain == 'A':
        residues = [A, B,...Z]
        amino_acids = [5]
        for res in residues:
            for aa in amino_acids:
                bias_per_residue[res, aa] = 100

So that A-Z equals each Glycine residue index in the WT sequence and then add a sub-block for each other amino acid type? Wouldn't that result in just the native sequence being returned? Or would I avoid that through specifying the positions to design by using the --specify_non_fixed flag?

dauparas / ProteinMPNN #64