RosettaCommons / RFdiffusion

Code for running RFdiffusion

PAE interaction score #88

Open Najy-Yusuf opened 1 year ago

Najy-Yusuf commented 1 year ago

We're trying to get a pae_interaction of less than 10. When we set the contig to A100-200/0 70-100, we get an i_pae (which we assume is the pae_interaction score) of around 26. When we use a contig of 70-100, we get a pae under 10, but with no "i" prefix, so we are unsure whether that is the interaction, binder, or target score.

Here are the 2 example contigs we used:

A100-200/ 0 70-100: design:0 n:0 mpnn:1.169 plddt:0.859 i_ptm:0.069 i_pae:26.417 rmsd:32.305

70-100: design:0 n:2 mpnn:0.986 plddt:0.944 ptm:0.765 pae:3.135 rmsd:1.291
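
The two log lines above report different keys: the two-chain contig gives i_ptm and i_pae, while the single-chain contig gives ptm and pae. As a minimal sketch, assuming the log keeps this space-separated key:value layout, the lines can be parsed to check which metric a run actually reported. The function name parse_score_line is hypothetical and not part of RFdiffusion.

```python
def parse_score_line(line):
    """Collect 'key:value' tokens with numeric values into a dict of floats."""
    scores = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if not sep:
            continue  # token without a colon, e.g. part of the contig string
        try:
            scores[key] = float(value)
        except ValueError:
            continue  # non-numeric value, e.g. the trailing '70-100:' label
    return scores


# The two lines quoted above, contig prefix included.
complex_run = parse_score_line(
    "A100-200/ 0 70-100: design:0 n:0 mpnn:1.169 plddt:0.859 "
    "i_ptm:0.069 i_pae:26.417 rmsd:32.305"
)
monomer_run = parse_score_line(
    "70-100: design:0 n:2 mpnn:0.986 plddt:0.944 ptm:0.765 pae:3.135 rmsd:1.291"
)

for name, scores in (("two-chain", complex_run), ("single-chain", monomer_run)):
    if "i_pae" in scores:
        print(name, "reports i_pae =", scores["i_pae"])
    else:
        print(name, "reports only pae =", scores.get("pae"))
```

Only the run that includes the target chain emits the i_-prefixed keys, so the pae of 3.135 from the 70-100 contig comes from a run with no second chain in it.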

drewschaub commented 1 year ago

I'm also running into a similar issue. My workflow is:

1) Generate backbones with RFdiffusion
2) I then run dl_interface_design to call ProteinMPNN to map a sequence

dl_interface_design.py -pdbdir . -outpdbdir relax/ -debug

3) After which I run predict.py:

predict.py -pdbdir relax/ -outpdbdir af2/ -scorefilename af2.sc -debug -recycle 5

I've tried several different backbones and all of my scores are around 26-27 for the pae_interaction. When I look at the structures in pymol (1. rfdiffusion backbone (blue), 2. proteinmpnn mapped backbone (green), and 3. af2 predicted structure (red)), the rfdiffusion and proteinmpnn backbones align for the most part; however, the AF2 predicted structure shows the binder very far away.

slives-lab commented 1 year ago

what is your workflow for visualizing binding in pymol?

drewschaub commented 1 year ago

After re-reading the paper I probably just need to run things a little longer and include a few controls.

  1. Identified some hydrophobic residues and generated hotspot residues
  2. Ran RFdiffusion on a few different variations to generate 200 backbones per run. I changed things like the size of the truncated protein, using a monomer, or using a trimer. So I have ~10,000 backbones
  3. I then aligned those backbones, via the truncated proteins, to my larger complete trimer structure and used that to filter out the bad backbones. It removed probably 70% of my backbones, but I learned a lot about the size of truncated proteins and what can be tolerated.
  4. Using my much smaller backbone set (~3000 backbones), I just visualized a few of them in pymol just as a QA and sanity check.
  5. I just ran dl_interface_design.py -pdbdir . -outpdbdir relax/ -debug to generate ProteinMPNN-FastRelax complex structures. The inputs were several PDB files where chain A was the binder (all glycines) and chain B was my receptor (sometimes a monomer, sometimes a trimer, but always just one chain id). I then did a visual QA check to see how those look in pymol relative to the RFdiffusion output, and those make sense. My binders are shifting a little bit, as expected.
  6. Next I run predict.py -pdbdir relax/ -outpdbdir af2/ -scorefilename af2.sc -debug -recycle 5, where the input is the output of the ProteinMPNN-FastRelax step. I'm assuming it takes just the ProteinMPNN sequence and has AF2 generate the binder. From that, I was assuming I just use pae_interaction to filter out any complexes with a PAE > 10.
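
The filter described in step 6 could be sketched roughly as below. This assumes the score file is whitespace-delimited, that its first non-empty row is a header naming the columns (including pae_interaction and description), and that rows may start with a SCORE: tag. That layout is an assumption and is not confirmed in this thread, so check it against your own af2.sc. The helper names and the example rows are made up for illustration.

```python
def read_scores(text):
    """Parse a whitespace-delimited score table into a list of dicts.

    Assumed layout: first non-empty row is the header; an optional
    leading 'SCORE:' tag on each row is dropped.
    """
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "SCORE:":
            fields = fields[1:]
        if header is None:
            header = fields
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def filter_by_pae_interaction(rows, cutoff=10.0):
    """Keep designs whose pae_interaction is strictly below the cutoff."""
    return [row for row in rows if float(row["pae_interaction"]) < cutoff]


# Made-up rows for illustration only, not real results.
example = """\
SCORE: pae_interaction plddt_binder description
SCORE: 26.4 85.9 design_0
SCORE: 7.2 91.3 design_1
"""

rows = read_scores(example)
passing = filter_by_pae_interaction(rows)
print([row["description"] for row in passing])
```

With a real run, the text would come from the score file instead, for example read_scores(open("af2.sc").read()). If every design lands around 26-27, nothing passes this filter, which matches what is reported above.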

I used RFdiffusion a couple of weeks ago to scaffold a motif, and that was a little more straightforward, as I could just run ESMfold on my ProteinMPNN sequences and then compare RMSD against the RFdiffusion backbones as an in silico screen.

I think I need to just work through one of the examples as a control to see if I'm just overlooking something.

ChengkuiZhao2048 commented 8 months ago

Have you solved the problem? I also got pae_interaction around 25 for all my 100 designs. I am just confused about it.