dauparas / ProteinMPNN

Code for the ProteinMPNN paper
MIT License

Amino acid sequence has too many "K/E" #89

Open kjogr12 opened 6 months ago

kjogr12 commented 6 months ago

Dear ProteinMPNN team, thank you very much for your great efforts. I would like to know why the designed amino acid sequence contains so many E/K residues (e.g. EEKEKELKKYAEKLKKEVKDIESIDVKDGEITVKAKKLTEKTKKAI...). It looks unusual. The input PDB file uses a backbone constructed with RFdiffusion. Is there any solution available? Thank you in advance for your help.

dauparas commented 6 months ago

It's hard to say. You could try adding a negative bias to the K and E amino acids when designing. The model is known to favor polar residues like K and E at surface positions when used with low sampling temperatures (0.1). You could also try increasing the sampling temperature.
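To make the two suggestions above concrete, here is a minimal toy sketch of how a per-residue bias and the sampling temperature change the probabilities at a single position. It assumes the common form softmax((logit + bias) / temperature); this is an illustration only, not necessarily how ProteinMPNN combines the terms internally. The function name `sample_probs` and the logit values are made up for the example.

```python
import math

def sample_probs(logits, bias=None, temperature=1.0):
    """Softmax over (logit + bias) / temperature for a dict of residue -> logit.

    Assumed form for illustration; the real model may scale the bias differently.
    """
    bias = bias or {}
    scaled = {aa: (logit + bias.get(aa, 0.0)) / temperature
              for aa, logit in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {aa: math.exp(v - top) for aa, v in scaled.items()}
    total = sum(exps.values())
    return {aa: e / total for aa, e in exps.items()}

# Hypothetical logits for one surface position (made-up numbers).
logits = {"K": 2.0, "E": 1.8, "A": 1.0, "S": 0.8, "Q": 0.9}

base = sample_probs(logits, temperature=0.1)    # low temperature: sharply peaked
warm = sample_probs(logits, temperature=0.5)    # higher temperature: flatter
biased = sample_probs(logits, bias={"K": -1.0, "E": -1.0}, temperature=0.1)

for name, probs in [("T=0.1", base), ("T=0.5", warm), ("T=0.1, K/E bias -1", biased)]:
    print(name, {aa: round(p, 3) for aa, p in probs.items()})
```

In this toy case, raising the temperature spreads probability away from the top-scoring residue, while the negative bias lowers the combined K/E share without changing the temperature. As far as I know, the repository exposes these knobs through the `--sampling_temp` and `--bias_AA_jsonl` options of `protein_mpnn_run.py`; please check the README for the exact input format.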

kjogr12 commented 5 months ago

Thank you for your insightful response. I appreciate your suggestion to add a negative bias to the K and E amino acids and to consider adjusting the sampling temperature. When applying a negative bias, do you have any recommended settings for each amino acid? I am considering configuring the settings based on the amino acid distribution of PDBbench shown in SPDesign (https://www.biorxiv.org/content/10.1101/2023.12.14.571651v1).

I would be grateful if you could share any recommendations you might have.
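For reference, this is roughly what I have in mind: a small sketch that measures the composition of a designed sequence and turns the gap to a target distribution into per-residue biases with a clipped log-ratio. The log-ratio rule, the `pseudocount` and `clip` values, and the function names are my own assumptions, not a recommendation from the ProteinMPNN authors. The uniform `target` below is only a placeholder; the actual reference frequencies (for example from PDBbench) would have to be filled in.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq."""
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}

def log_ratio_bias(observed, target, pseudocount=1e-3, clip=2.0):
    """Hypothetical heuristic: bias = log(target / observed), clipped to [-clip, clip]."""
    bias = {}
    for aa in AMINO_ACIDS:
        ratio = (target[aa] + pseudocount) / (observed[aa] + pseudocount)
        bias[aa] = max(-clip, min(clip, math.log(ratio)))
    return bias

# Sequence fragment quoted earlier in this issue (the full design is longer).
seq = "EEKEKELKKYAEKLKKEVKDIESIDVKDGEITVKAKKLTEKTKKAI"
observed = composition(seq)

# Placeholder target: uniform over 20 residues. Replace with real reference frequencies.
target = {aa: 1 / 20 for aa in AMINO_ACIDS}

bias = log_ratio_bias(observed, target)
print("K fraction:", round(observed["K"], 3), "E fraction:", round(observed["E"], 3))
print("bias K:", round(bias["K"], 2), "bias E:", round(bias["E"], 2))
```

Over-represented residues such as K and E receive a negative bias, and residues missing from the fragment receive a positive one, capped by `clip` so that no single residue is forced in or out too strongly.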