luost26 / RDE-PPI

:mountain: Rotamer Density Estimator is an Unsupervised Learner of the Effect of Mutations on Protein-Protein Interaction (ICLR 2023)
Apache License 2.0

Some destabilizing mutations are predicted as stabilizing #6

Open amin-sagar opened 1 month ago

amin-sagar commented 1 month ago

Hello. Thanks for this awesome work. I tested RDE-PPI on chains C and B of PDB ID 8TCG. There is a critical aspartic acid (D12) in chain C that is required for binding; it forms multiple hydrogen bonds with chain B. It is experimentally known that mutating it to alanine abolishes binding.

image

However, when I do an alanine scan, the mutation of this D12 to A is predicted to be stabilizing.

image

This seems quite counterintuitive to me as there is a clear loss of interactions and alanine should have much higher conformational entropy.

Have you experienced this? What could be the possible reasons for this observation? I would be really grateful for your suggestions.

Thanks, Amin.

luost26 commented 1 month ago

Hi Amin,

Thanks for the interesting observation! This is a very nice example that is worth investigating!

Currently, the pointmut script uses RDE-Network for ddG prediction. RDE-Network is a neural network, so it does not estimate ddG directly from entropies and behaves like a black box.

I will create another script that uses RDE-Linear, which predicts ddG directly from entropies, by the end of next week. After that, I will look into this case and let you know!

Thanks, Shitong

amin-sagar commented 1 month ago

Thanks @luost26 for looking into this. Looking forward to the RDE-Linear script.

luost26 commented 1 month ago

Apologies for the delay; I have had limited bandwidth and a heavy workload.

I am working on it now in the pointmut-entropy branch. I expect it to be finished in a few days.

luost26 commented 1 month ago

Hi Amin,

Finally, the entropy-based ddG prediction is done!

To use the script, please first modify your config yaml file to specify the receptor chains and ligand chains (see the updated config yaml example): https://github.com/luost26/RDE-PPI/blob/60b0da8a47f8ebe761c08906a11e20c4041800b7/configs/inference/7FAE_RBD_Fv_mutation.yml#L3

Then, run the following command using the updated config yaml:

python pointmut_analysis_entropy.py <path-to-config-yaml>

I've tested the script on the SARS-CoV-2 example. The model succeeds on two mutations (the result is far from perfect but I think it looks ok):

     mutstr  H_lig_ub_wt  H_lig_ub_mt   H_rec_ub  H_lig_b_wt  H_rec_b_wt  H_lig_b_mt  H_rec_b_mt  ddG_pred      rank
112   TH31W     0.086814   -12.750683  -6.863749    0.149922   -3.520524  -12.685431   -6.611431 -1.160869  0.004049
155   AH53F     1.977367    -8.707848   0.000000    1.977367    0.000000   -8.479116    0.000000 -0.718618  0.024291
237   NH57L    -1.544040     1.518740 -12.127681   -1.657046  -11.497478    1.189151  -11.488028  0.586042  0.649798
333  RH103M    -4.149112     2.826316  -4.302805   -4.068836   -4.612458    3.054779   -5.298234  0.813361  0.755061
346  LH104F    -0.781553    -8.909361 -12.358603   -1.600346  -11.828930   -8.469375  -11.443017  0.562650  0.633603
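
For anyone reading the entropy columns, here is a minimal sketch of one way to combine them. It assumes `lig`/`rec` mean ligand/receptor chains, `ub`/`b` mean unbound/bound, and `wt`/`mt` mean wild type/mutant, which is a reading of the headers that is not confirmed in this thread. The helper names are hypothetical, and this is plain bookkeeping, not the RDE-Linear formula (its coefficients are not shown here).

```python
# Three rows copied from the table above. The interpretation of the column
# names (ub = unbound, b = bound, wt = wild type, mt = mutant) is an assumption.
rows = {
    "TH31W": dict(H_lig_ub_wt=0.086814, H_lig_ub_mt=-12.750683, H_rec_ub=-6.863749,
                  H_lig_b_wt=0.149922, H_rec_b_wt=-3.520524,
                  H_lig_b_mt=-12.685431, H_rec_b_mt=-6.611431, ddG_pred=-1.160869),
    "AH53F": dict(H_lig_ub_wt=1.977367, H_lig_ub_mt=-8.707848, H_rec_ub=0.000000,
                  H_lig_b_wt=1.977367, H_rec_b_wt=0.000000,
                  H_lig_b_mt=-8.479116, H_rec_b_mt=0.000000, ddG_pred=-0.718618),
    "NH57L": dict(H_lig_ub_wt=-1.544040, H_lig_ub_mt=1.518740, H_rec_ub=-12.127681,
                  H_lig_b_wt=-1.657046, H_rec_b_wt=-11.497478,
                  H_lig_b_mt=1.189151, H_rec_b_mt=-11.488028, ddG_pred=0.586042),
}


def binding_entropy_change(row, state):
    """Bound minus unbound entropy for state 'wt' or 'mt' (hypothetical helper)."""
    bound = row[f"H_lig_b_{state}"] + row[f"H_rec_b_{state}"]
    # The table has a single unbound receptor column, shared by wt and mt.
    unbound = row[f"H_lig_ub_{state}"] + row["H_rec_ub"]
    return bound - unbound


def entropy_shift(row):
    """Mutant binding entropy change minus the wild-type one."""
    return binding_entropy_change(row, "mt") - binding_entropy_change(row, "wt")


for mutstr, row in rows.items():
    print(f"{mutstr:>6}  shift={entropy_shift(row):+.3f}  ddG_pred={row['ddG_pred']:+.3f}")
```

Printing the shift next to `ddG_pred` makes it easier to see which entropy terms move for a given mutation, without implying that the two are related by this exact expression.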

amin-sagar commented 2 weeks ago

Thanks @luost26. I tested the new script on the same complex. I widened the interface definition (10 Å cutoff) to intentionally include some residues that are not actually interacting.
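
To make the effect of the cutoff concrete, here is a minimal sketch of distance-based interface selection. It is a generic approach and an assumption here, not necessarily how RDE-PPI picks interface residues; the function name, residue IDs, and coordinates are all made-up toy values.

```python
def interface_residues(chain_a, chain_b, cutoff=10.0):
    """Return residue IDs of chain_a with any atom within `cutoff` Å of chain_b.

    Each chain is a dict mapping a residue ID to a list of (x, y, z) atom
    coordinates. Hypothetical helper for illustration only.
    """
    cutoff_sq = cutoff ** 2
    atoms_b = [xyz for atoms in chain_b.values() for xyz in atoms]
    selected = []
    for res_id, atoms in chain_a.items():
        # Compare squared distances to avoid taking square roots.
        close = any(
            sum((p - q) ** 2 for p, q in zip(a, b)) <= cutoff_sq
            for a in atoms
            for b in atoms_b
        )
        if close:
            selected.append(res_id)
    return selected


# Toy coordinates (made up): one atom per residue, all on the x axis.
toy_a = {"res1": [(0.0, 0.0, 0.0)], "res2": [(9.0, 0.0, 0.0)], "res3": [(30.0, 0.0, 0.0)]}
toy_b = {"partner": [(3.0, 0.0, 0.0)]}

print(interface_residues(toy_a, toy_b, cutoff=4.0))
print(interface_residues(toy_a, toy_b, cutoff=10.0))
```

With the tighter cutoff only the residue in direct contact is selected, while the 10 Å cutoff also pulls in a residue 6 Å away, which is the kind of non-contacting residue I added on purpose.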

The values are plotted here:

image

The entropy-based method now does predict the D12A mutation to be destabilizing, although the signal is very strong, and some residues that are not at the interface, like W59, have a very large positive ddG.

Mini_b6_alascan.csv

I have also attached the output CSV in case it helps to further optimize the model. It seems like the model doesn't like mutating hydrophobic residues to alanine. Do you see a reason why residues that are not really at the interface give large positive ddG values?
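
In case it is useful for triage, here is a minimal sketch of how such cases could be flagged automatically. It assumes the CSV has `mutstr` and `ddG_pred` columns like the table earlier in this thread, and that `mutstr` is wild-type residue, chain ID, residue number, mutant residue (inferred from strings like `TH31W`). The thresholds, the distance lookup, and the sample rows are hypothetical placeholders, not values from the attached file.

```python
import csv
import io
import re

# Assumed mutstr layout: <wt aa><chain letter><residue number><mutant aa>.
MUTSTR = re.compile(r"^(?P<wt>[A-Z])(?P<chain>[A-Za-z])(?P<resseq>\d+)(?P<mt>[A-Z])$")


def flag_suspicious(csv_text, min_dist, ddg_threshold=1.0, contact_cutoff=5.0):
    """List mutations with a large positive ddG at residues far from the partner.

    `min_dist` maps (chain, resseq) to the minimum distance in Å to the partner
    chain; it has to be computed separately. Hypothetical helper.
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = MUTSTR.match(row["mutstr"])
        if match is None:
            continue  # skip rows whose mutstr does not fit the assumed layout
        key = (match["chain"], int(match["resseq"]))
        ddg = float(row["ddG_pred"])
        dist = min_dist.get(key)
        if dist is not None and ddg > ddg_threshold and dist > contact_cutoff:
            flagged.append((row["mutstr"], ddg, dist))
    return flagged


# Made-up placeholder rows and distances for a fictional chain X.
SAMPLE_CSV = "mutstr,ddG_pred\nLX10A,0.9\nFX20A,2.4\nSX30A,0.1\n"
SAMPLE_DIST = {("X", 10): 2.8, ("X", 20): 8.5, ("X", 30): 9.1}

for mutstr, ddg, dist in flag_suspicious(SAMPLE_CSV, SAMPLE_DIST):
    print(f"{mutstr}: ddG_pred={ddg:+.2f}, min distance={dist:.1f} Å")
```

Only the placeholder `FX20A` is flagged here, since it is the only row that is both above the ddG threshold and beyond the contact cutoff.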