facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

like-for-like prediction of binding affinity for pairs of chains with different length #201

Closed by avilella 2 years ago

avilella commented 2 years ago

Hi, I've been using the log_likelihood calculation script as a proxy for the effect on stability of changes in one of the sequences (chains) in a two-chain PDB structure predicted by AlphaFold2. In this context, one chain is the heavy chain of a monoclonal antibody and the second chain is the light chain. They were originally predicted as a single chain with a flexible linker using AlphaFold2-monomer.

Now I am interested in ways to calculate like-for-like predictions of binding affinity for pairs of chains where one chain is constant and the other changes in sequence, length, or both.
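
For the different-length case, one thing that may matter is whether a score is summed or averaged over residues. Below is a minimal sketch, assuming per-residue log-probabilities are already available as plain Python lists. The function names and the numbers are made up for illustration and are not output from the esm scripts.

```python
import math


def total_log_likelihood(per_residue_logp):
    """Sum of per-residue log-probabilities (grows in magnitude with length)."""
    return math.fsum(per_residue_logp)


def mean_log_likelihood(per_residue_logp):
    """Average log-probability per residue (comparable across lengths)."""
    if not per_residue_logp:
        raise ValueError("need at least one residue")
    return math.fsum(per_residue_logp) / len(per_residue_logp)


def perplexity(per_residue_logp):
    """exp(-mean log-likelihood); lower means the sequence fits better."""
    return math.exp(-mean_log_likelihood(per_residue_logp))


# Hypothetical per-residue log-probabilities for two variable chains
# of different length, each scored against the same constant partner.
short_chain = [-1.0] * 110
long_chain = [-1.0] * 125

for name, logp in [("short", short_chain), ("long", long_chain)]:
    print(name, total_log_likelihood(logp), mean_log_likelihood(logp))
```

In this toy case the summed score makes the longer chain look worse purely because it has more residues, while the per-residue mean is identical for both, so the mean (or the perplexity) is the quantity that can be compared like-for-like.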

The context is monoclonal antibody Fvs (single chain with linker) predicted against their antigen with AlphaFold2-multimer (v2.2.0). I can take these and run the log_likelihood script, but my question is: would the log_likelihood value be correlated with the stability of the linked Fv chain alone, or with the stability of the linked Fv chain when facing the antigen? If the latter, I take it that the log_likelihood score is a measure of "binding affinity", with the values closest to zero being the best binders.

Looking forward to hearing from you

tomsercu commented 2 years ago

Thanks for the creative ideas around using the inverse folding model! The latter interpretation sounds reasonable, but to me how good a prediction quality you get there is really an empirical question.
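
One way to approach that empirical check is to rank-correlate the scores against measured affinities for a set of variants. The sketch below is only an illustration: it implements Spearman's rank correlation in plain Python, and the scores and affinity values are placeholders, not real measurements or esm output.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder data: per-residue mean log-likelihoods for five variants and
# hypothetical measured affinities (higher = tighter binding).
scores = [-1.9, -1.4, -1.6, -1.2, -2.1]
affinities = [6.8, 8.1, 7.9, 8.4, 6.5]

print("Spearman rho:", spearman(scores, affinities))
```

A rank correlation is used here because it only asks whether the scores order the variants the same way the measurements do, which is what "closest to zero being the best binders" implies.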