Open gha2012 opened 9 months ago
I'm not one of the developers.
Have you looked at dl_binder_design (https://github.com/nrbennet/dl_binder_design)? The binder design protocol might be what you're looking for. It couples ProteinMPNN with AF2 and predicts solubility and binding affinity using AF2 scores.
If I'm running ProteinMPNN by itself, I'll run a few jobs using different models and different values for the sampling temperature (T) to generate several sequences. I'll then filter out sequences with high counts of alanine. I'll also calculate pI values, as the models have a tendency to generate a lot of charged residues (e.g. glutamic acid).
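As a rough illustration of that filtering step, here is a minimal Python sketch. The function names, the thresholds, and the pKa table are my own assumptions for illustration and are not part of ProteinMPNN or dl_binder_design; different pKa tables will give somewhat different pI estimates.

```python
from typing import Iterable, List

# Approximate pKa values (one commonly used textbook-style set; an assumption,
# other tables shift the estimated pI by a few tenths of a pH unit).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 2.0


def alanine_fraction(seq: str) -> float:
    """Fraction of residues in seq that are alanine."""
    return seq.upper().count("A") / len(seq) if seq else 0.0


def longest_run(seq: str, residue: str = "A") -> int:
    """Length of the longest consecutive run of `residue` (e.g. a poly-Ala stretch)."""
    best = current = 0
    for aa in seq.upper():
        current = current + 1 if aa == residue else 0
        best = max(best, current)
    return best


def net_charge(seq: str, ph: float) -> float:
    """Henderson-Hasselbalch estimate of the net charge of seq at a given pH."""
    seq = seq.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))  # C-terminus
    for aa, pka in PKA_POSITIVE.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEGATIVE.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge


def estimate_pi(seq: str) -> float:
    """Bisect on pH for the point where the estimated net charge crosses zero."""
    lo, hi = 0.0, 14.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged, so the pI is at a higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


def filter_sequences(
    seqs: Iterable[str],
    max_ala_fraction: float = 0.3,  # hypothetical threshold
    max_ala_run: int = 4,           # hypothetical threshold
) -> List[str]:
    """Keep sequences that are neither alanine-rich overall nor contain long poly-Ala runs."""
    return [
        s for s in seqs
        if alanine_fraction(s) <= max_ala_fraction
        and longest_run(s, "A") <= max_ala_run
    ]
```

In practice you would read the designed sequences from the FASTA files that the design run writes, apply `filter_sequences`, and then look at `estimate_pi` for the survivors to flag designs that are strongly acidic or basic. The cutoffs above are placeholders to tune for your own system.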
The issue of repeats isn't unique to ProteinMPNN; I notice it when running ESM as well.
It's also not unique to this domain. If I use OpenAI's Whisper to transcribe audio, it's common for it to generate repeats there as well.
Thank you for your comment! Yes, I am using the binder design protocol. I guess I should have posted the question there, but I thought it was related to ProteinMPNN.
Hi, thank you very much for making this available! I am using this together with RFdiffusion to create small protein complexes, and the interfaces look very good in many cases. However, I found that ProteinMPNN often creates very hydrophobic poly-Ala surface patches. I am a bit worried that this will lead to solubility issues. Is there a parameter to control this? Thanks for any suggestions!