martinpacesa / BindCraft

User friendly and accurate binder design pipeline
MIT License

How to change the location of a binder #56

Closed bzhousd closed 4 weeks ago

bzhousd commented 1 month ago

Dear authors, congratulations on your excellent work. I have a question regarding the manipulation of the binder's location.

The hotspots on my protein are located on one side, but the binder is being generated on the opposite side. I am considering the following approaches:

  1. Adding the surroundings of the hotspots to the hotspot set to potentially guide the binder's growth closer to the hotspots.

  2. Is there a weight parameter in the loss function that can be adjusted to prioritize the interaction between the hotspots and the binder in BindCraft?

  3. I plan to trim the target protein by removing some amino acids on the side opposite to my hotspots. I'm concerned that this could significantly alter the conformation of the hotspots. Although BindCraft can adjust both the target and the binder flexibly, I wish to keep the target's conformation unchanged. Is this feasible?

Thank you for your help

martinpacesa commented 1 month ago

Thank you!

Yes, the pipeline optimises for a combined loss function of confidence and other terms described in the paper. The hotspots are only part of it. All 3 strategies you describe are feasible.

  1. This could help. The contact loss is calculated only for the residues you select, so selecting more, and perhaps more appropriate, residues should help. Avoid residues with a lot of entropy, such as lysines.
  2. You can increase the weight of the contact loss between chains/hotspots, which is 'weights_con_inter'. Try different values, such as 2, 5, 10 and 20, and see what works.
  3. This is the best strategy. With the default settings it should work fine.
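
A minimal sketch of such a sweep over 'weights_con_inter', assuming the weight lives in a JSON settings file that is passed to the pipeline. The helper name `write_weight_variants`, the output file names, and the placeholder base settings are hypothetical; only the key name and the suggested values come from the comment above.

```python
import json
import tempfile
from pathlib import Path


def write_weight_variants(base_settings, values, out_dir, key="weights_con_inter"):
    """Write one settings JSON per candidate weight and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for value in values:
        variant = dict(base_settings)  # shallow copy so the base stays untouched
        variant[key] = value
        path = out_dir / f"settings_{key}_{value}.json"
        path.write_text(json.dumps(variant, indent=2))
        written.append(path)
    return written


# Placeholder base settings: in practice, load your own settings JSON here.
# The starting value 1.0 is made up for the demo.
base = {"weights_con_inter": 1.0}
variant_paths = write_weight_variants(base, [2, 5, 10, 20], tempfile.mkdtemp())
for p in variant_paths:
    print(p.name)
```

Each generated file can then be used for a separate short run, so the resulting binder placements can be compared side by side.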

bzhousd commented 4 weeks ago

Hello @martinpacesa, thank you for your valuable suggestions. Following your advice, I've made progress, and the binder is now positioned closer to the hotspots.

However, I've yet to see any designs marked as 'accepted'. Could you provide some insight on adjusting the cutoff? Also, I've found that PDB files are scattered across several directories, including Trajectory/Relaxed, Trajectory/LowConfidence, and MPNN/Relaxed. If there are no accepted designs, which folder would you recommend for finding relatively good designs?

I was hoping trajectory_stats.csv would help identify superior designs, but it seems to be missing a 'rank' column. Your advice on how to proceed would be immensely helpful.

LennartNickel commented 3 weeks ago

Quick note on how the PDB files are distributed. First, the PDB files are created from the trajectory (AF2 backprop) and classified depending on whether they have low confidence, clashes, etc. If they pass the initial filters, they are relaxed and go into the Trajectory/Relaxed folder. These sequences are not ready to be tested experimentally: they are still missing the MPNN step, which redesigns every residue except those involved in the interface. The repredictions of these MPNN designs (default 20x per trajectory) are collected in the MPNN folder and subsequently filtered and classified as either rejected or accepted. Therefore, you will find potential designs in the Rejected folder. The stats for these can be found in the mpnn_design_stats.csv file. You might want to look there and rank them yourself. Be aware that you will find two models per design (model 1, model 2), referring to two different repredictions of the same design. The better of the two is saved.
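
Ranking the stats file yourself could look like the sketch below, which uses only the standard library. The column names `Design` and `score` are placeholders, not the actual headers of mpnn_design_stats.csv; substitute whichever metric column you want to sort by, and flip `higher_is_better` for metrics where lower is better.

```python
import csv
import io


def rank_designs(csv_text, metric, higher_is_better=True):
    """Return rows sorted by `metric`, each with an added 1-based 'rank' field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if metric not in (reader.fieldnames or []):
        raise KeyError(f"column {metric!r} not found in CSV header")
    scored = []
    for row in reader:
        try:
            scored.append((float(row[metric]), row))
        except (TypeError, ValueError):
            continue  # skip rows where the metric is empty or non-numeric
    scored.sort(key=lambda pair: pair[0], reverse=higher_is_better)
    return [dict(row, rank=i) for i, (_, row) in enumerate(scored, start=1)]


# Made-up demo data; for a real run, read the text of your stats file instead.
demo_csv = "Design,score\nd1,0.42\nd2,0.81\nd3,\n"
ranked = rank_designs(demo_csv, "score")
for row in ranked:
    print(row["rank"], row["Design"], row["score"])
```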

Hope that helps!

bzhousd commented 3 weeks ago

Thank you for your reply. It appears that I need to adjust the default settings on my end.

Currently, the Accepted, Rejected, and MPNN folders are empty, and the mpnn_design_stats.csv file contains only a header line. There are 42 proteins in the Relax folder. However, when BindCraft uses MPNN to design the non-interface sequences, none of the relaxed designs meet the AF2 filter criteria. The original message is "Base AF2 filters not passed for gd4h_l85_s653444_mpnn1, skipping interface scoring".

I also found that BindCraft printed this message when it finished:

The ratio of successful designs is lower than defined acceptance rate! Consider changing your design settings! Script execution stopping... Finished all designs. Script execution for 52 trajectories took: 45 hours, 57 minutes, 33 seconds

Could you give me some advice on adjusting the settings?

sami-chaaban commented 3 weeks ago

You can have it keep trying despite the low acceptance rate by setting acceptance_rate to false (see #51).
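
A minimal sketch of making that change programmatically, assuming acceptance_rate is a top-level key in the settings JSON you pass to the pipeline. The helper `update_setting`, the demo file name, and the starting value are hypothetical; the helper raises an error instead of silently adding a key that the file does not already contain.

```python
import json
import tempfile
from pathlib import Path


def update_setting(settings_path, key, value):
    """Overwrite one existing top-level key in a JSON settings file."""
    path = Path(settings_path)
    settings = json.loads(path.read_text())
    if key not in settings:
        raise KeyError(f"{key!r} is not present in {path.name}")
    settings[key] = value
    path.write_text(json.dumps(settings, indent=2))
    return settings


# Demo on a throwaway file; the starting value 0.01 is made up.
demo_path = Path(tempfile.mkdtemp()) / "demo_settings.json"
demo_path.write_text(json.dumps({"acceptance_rate": 0.01}))
updated = update_setting(demo_path, "acceptance_rate", False)
print(updated)
```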

bzhousd commented 3 weeks ago

I appreciate your suggestion. However, I've already run BindCraft for 45 hours on an H100 GPU. While changing acceptance_rate would keep the program running, I'd rather lower the filter cutoffs so that some designs make it through the MPNN step.

martinpacesa commented 3 weeks ago

BindCraft is computationally heavy, especially if your target+binder is large. Perhaps your input pdb is not trimmed well and results in overall low pLDDTs?
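
For the trimming itself, a rough standard-library sketch is shown below. It keeps only the coordinate records of one chain within a residue-number range, relying on the fixed-column PDB layout (chain ID in column 22, residue number in columns 23 to 26). The function name, chain, and residue range are hypothetical, and choosing where to cut still requires looking at the structure so that the hotspot region stays intact.

```python
def trim_pdb_lines(lines, chain, keep_start, keep_end):
    """Keep ATOM/HETATM records of `chain` with residue number in [keep_start, keep_end]."""
    kept = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # drop headers, other record types, and blank lines
        if line[21] != chain:
            continue  # column 22 holds the chain identifier
        resseq = int(line[22:26])  # columns 23-26 hold the residue number
        if keep_start <= resseq <= keep_end:
            kept.append(line)
    return kept


def make_ca_line(serial, chain, resseq):
    """Build a made-up CA atom record with correct PDB column alignment."""
    return (f"ATOM  {serial:5d}  CA  ALA {chain}{resseq:4d}    "
            f"{1.0:8.3f}{2.0:8.3f}{3.0:8.3f}  1.00 90.00           C")


# Demo: three residues on chain A and one on chain B; keep A:10-20.
demo_lines = [
    make_ca_line(1, "A", 5),
    make_ca_line(2, "A", 10),
    make_ca_line(3, "A", 20),
    make_ca_line(4, "B", 15),
]
trimmed = trim_pdb_lines(demo_lines, "A", 10, 20)
print(len(trimmed))
```

For a real file, the lines would come from reading the input PDB, and the trimmed lines would be written back out, one per line, as the new target.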