Closed noah-c-noah-do closed 1 year ago
You could add a bias toward the input sequence. The stronger the bias, the fewer mutations will be made. https://github.com/dauparas/ProteinMPNN/blob/main/helper_scripts/make_bias_per_res_dict.py
Hi Justas,
Thanks for the quick reply! If I'm grokking that code correctly, then you're suggesting I just edit the code:

    if chain == 'A':
        residues = [A, B,...Z]
        amino_acids = [5]
        for res in residues:
            for aa in amino_acids:
                bias_per_residue[res, aa] = 100

So that A-Z are the indices of every glycine residue in the WT sequence, and then add a similar sub-block for each other amino acid type? Wouldn't that just return the native sequence? Or would I avoid that by specifying the positions to design with the --specify_non_fixed flag?
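To make the idea concrete, here is a minimal sketch of building the whole bias table in one pass rather than one sub-block per amino acid type. It assumes the 21-letter alphabet order `ACDEFGHIKLMNPQRSTVWYX`, which is consistent with glycine being index 5 in the snippet above; `native_bias` and the `strength` parameter are hypothetical names for illustration, not part of the helper script.

```python
# Assumed alphabet order (glycine at index 5, matching amino_acids = [5] above).
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def native_bias(sequence, strength):
    """Return an L x 21 nested list with `strength` at each position's native residue.

    Hypothetical helper: every other entry stays at 0.0, so only the
    wild-type amino acid is favored at each position.
    """
    bias = [[0.0] * len(ALPHABET) for _ in sequence]
    for pos, aa in enumerate(sequence):
        bias[pos][ALPHABET.index(aa)] = strength
    return bias


# Toy example: a 4-residue chain with a moderate bias toward the native residues.
bias = native_bias("MGSG", strength=2.0)
```

With a very large value such as 100 the native residue would presumably win at every biased position, so a smaller `strength` is the knob to try if some mutations should still get through.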
Basically, is there a way to limit the total differences from the input sequence? Could one limit the total mutations to <5 per 100 residues of chain length or something similar?
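For reference, the kind of limit I have in mind can at least be checked after sampling. The sketch below is only a post hoc filter under my own assumptions, not a ProteinMPNN feature; `count_mutations` and `within_budget` are hypothetical names.

```python
def count_mutations(native, designed):
    """Count positions where the designed sequence differs from the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(native, designed))


def within_budget(native, designed, max_per_100=5):
    """True if there are fewer than `max_per_100` mutations per 100 residues."""
    # Compare as integers to avoid floating-point division.
    return count_mutations(native, designed) * 100 < max_per_100 * len(native)


# Toy example: a 100-residue poly-alanine "native" with 4 or 5 substitutions.
native = "A" * 100
four_mutations = "G" * 4 + "A" * 96
five_mutations = "G" * 5 + "A" * 95
```

Designs that fail `within_budget` would simply be discarded, which wastes samples but needs no changes to the sampling itself.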