nrbennet / dl_binder_design

MIT License

Deep Learning Binder Results #32

Open okanders opened 1 year ago

okanders commented 1 year ago

Hi @nrbennet, I have been designing insulin binders of my own (following the example specifications of RFdiffusion) and then running the backbones through MPNN_FR and AF2 initial guess. I compared one candidate binder against a benchmark insulin binder from the supplementary material of "Improving de novo protein binder design with deep learning". Even though the target template was identical, the results are confusing:

InsulinR_mb: {'plddt_total': 95.02760208110635, 'plddt_binder': 91.0824370734358, 'plddt_target': 96.7371735844303, 'pae_binder': 2.7015252, 'pae_target': 2.4785695, 'pae_interaction': 4.80579948425293, 'time': 146.24850199604407}

design_ppi_scaffolded_6_dldesign_4: {'plddt_total': 52.28481075101907, 'plddt_binder': 94.86918808372161, 'plddt_target': 33.83158057351464, 'pae_binder': 1.7483437, 'pae_target': 19.690062, 'pae_interaction': 26.69976043701172, 'time': 16.96811721706763}

I am curious how you would read these outputs. Shouldn't the plddt of the target be relatively high in both cases, given that the same template is in use (the template atom positions are identical)? Is this supposed to indicate that my binder fails when it is recapitulated, and that this failure degrades the accuracy of the target structure? Thanks, I would appreciate the help!
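One way to see where the two runs diverge is to subtract the scores metric by metric. The snippet below is a minimal sketch: the values are the ones posted above, rounded to two decimals, and `compare_scores` is a hypothetical helper name, not part of dl_binder_design.

```python
# Scores copied from the two runs above, rounded to two decimals.
reference = {  # InsulinR_mb
    "plddt_binder": 91.08, "plddt_target": 96.74,
    "pae_binder": 2.70, "pae_target": 2.48, "pae_interaction": 4.81,
}
design = {  # design_ppi_scaffolded_6_dldesign_4
    "plddt_binder": 94.87, "plddt_target": 33.83,
    "pae_binder": 1.75, "pae_target": 19.69, "pae_interaction": 26.70,
}


def compare_scores(ref, new):
    """Hypothetical helper: return {metric: new - ref} for shared metrics."""
    return {k: round(new[k] - ref[k], 2) for k in ref if k in new}


deltas = compare_scores(reference, design)

# Print the metrics with the largest absolute change first.
for metric, delta in sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{metric:16s} {delta:+8.2f}")
```

Sorted this way, the target-side metrics show the largest changes, while the binder-only metrics are slightly better for the new design.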

nrbennet commented 1 year ago

You should check that the target is the second chain in your pdb in both cases. The prediction script here determines the target chain by taking the second chain in the pdb.
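The chain-order check suggested above can be done without any external tooling. This is a sketch that only assumes the standard fixed-column PDB layout (chain ID in column 22 of `ATOM` records). The function names and the demo records are made up for illustration and are not taken from the repository's scripts.

```python
def chain_order(pdb_text):
    """Return chain IDs in order of first appearance among ATOM records."""
    order = []
    for line in pdb_text.splitlines():
        # In the fixed-column PDB format the chain ID is column 22 (index 21).
        if line.startswith("ATOM") and len(line) > 21:
            chain_id = line[21]
            if chain_id not in order:
                order.append(chain_id)
    return order


def atom_line(serial, name, resname, chain, resseq):
    """Hypothetical helper: build a simplified ATOM record for the demo."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"
            f"    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Made-up two-chain example: binder (A) first, target (B) second.
demo = "\n".join([
    atom_line(1, "CA", "MET", "A", 1),
    atom_line(2, "CA", "GLY", "B", 1),
])

chains = chain_order(demo)
print("chain order:", chains)
if len(chains) >= 2:
    print("second chain (treated as the target, per the comment above):", chains[1])
```

For a real file, the same function can be applied to the file contents, for example `chain_order(open(path).read())` with `path` pointing at each of the two pdbs being compared.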

okanders commented 1 year ago

@nrbennet Yes, my binder is the first chain, and my target is the second chain. I am using pdb_interfaceAF2predict.py. Do you have any recommendations for debugging this result, perhaps how to access the template search?

The post-MPNN pdb file (design_ppi_scaffolded_6_dldesign_4) seems to be lacking some side-chain information; perhaps that is the issue. If so, how were you able to maintain this information when scaffolding a binder, encoding a sequence, and then validating? The RFdiffusion scaffold seems to erase detailed information about the target.
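Whether a pdb is actually missing side chains can be checked by counting, per chain, the non-glycine residues that contain only backbone atoms. The following is a rough sketch under the assumption of standard fixed-column `ATOM` records. `backbone_only_residues` and `atom_line` are hypothetical names and the demo structure is fabricated. It does not diagnose the AF2 result by itself; it only reports what the file contains.

```python
from collections import defaultdict

BACKBONE = {"N", "CA", "C", "O", "OXT"}


def backbone_only_residues(pdb_text):
    """Map chain ID -> (backbone-only non-glycine residues, total residues)."""
    atoms = defaultdict(set)  # (chain, residue number + icode, resname) -> atom names
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        name = line[12:16].strip()
        # Skip hydrogens (names such as H, HA, 1HB) so they do not count as side chain.
        if name.lstrip("0123456789").startswith("H"):
            continue
        key = (line[21], line[22:27].strip(), line[17:20].strip())
        atoms[key].add(name)

    summary = defaultdict(lambda: [0, 0])
    for (chain, _, resname), names in atoms.items():
        summary[chain][1] += 1
        # Glycine has no heavy side-chain atoms, so it is never flagged.
        if resname != "GLY" and names <= BACKBONE:
            summary[chain][0] += 1
    return {chain: tuple(counts) for chain, counts in summary.items()}


def atom_line(serial, name, resname, chain, resseq):
    """Hypothetical helper: build a simplified ATOM record for the demo."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"
            f"    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Fabricated example: chain A has a backbone-only ALA and a GLY,
# chain B has a LEU that includes a CB atom.
records = []
for i, (name, res, chain, num) in enumerate(
    [(a, "ALA", "A", 1) for a in ("N", "CA", "C", "O")]
    + [(a, "GLY", "A", 2) for a in ("N", "CA", "C", "O")]
    + [(a, "LEU", "B", 1) for a in ("N", "CA", "C", "O", "CB")],
    start=1,
):
    records.append(atom_line(i, name, res, chain, num))

report = backbone_only_residues("\n".join(records))
for chain, (missing, total) in sorted(report.items()):
    print(f"chain {chain}: {missing}/{total} non-glycine residues are backbone-only")
```

Running this on both attached pdbs and comparing the counts for the target chain would show whether the two files really differ in side-chain content.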

Thanks so much and truly appreciate any help!

I attached zipped examples of my insulin pdb and the ideal one as well:

design_ppi_scaffolded_6_dldesign_4.pdb.zip

InsulinR_mb.pdb.zip