HeliXonProtein / OmegaFold

OmegaFold Release Code
Apache License 2.0

Reproducing the OmegaFold result gives a different result #14

Open pengzhangzhi opened 1 year ago

pengzhangzhi commented 1 year ago

Hi. I am trying to reproduce the results shown in your paper, specifically those in Fig. 2, shown in the figure below. [image]

However, the CDRH3 RMSD is 3.6 Å in our test, which is significantly larger than your 0.38 Å. A visual comparison of the native and predicted structures shows the deviation in the CDRH3 region, circled in red. [image]

I suspect something is wrong with our setup. To make sure the reproduction is fair, I hope you can help me find out what went wrong.

We used OmegaFold from this repo to predict the structure of 7kpj_2_C, which is the heavy chain of 7KPJ. The predicted and native PDB files are attached to this issue: pdb.zip

pengzhangzhi commented 1 year ago

Let me know if you want more details about our reproduction.

mooninrain commented 1 year ago

Hi,

Thanks a lot for your reproduction and reply!

The light chain influences CDR H3, so in the paper we tested with a different model, an ensemble model that has complex-modeling ability.

That model was trained on multi-chain data and runs inference on the light and heavy chains together, which helps loop modeling.

We will release more ensemble code and model soon. Stay tuned!

pengzhangzhi commented 1 year ago

Thanks! That explains a lot. Does the ensemble consist of multiple models, or just one model with more sophisticated training?

mooninrain commented 1 year ago

Thanks! We will release multiple models.

There are several ways to encourage divergent predictions, including changing the model architecture, training curriculum and data sampling algorithm. So these ensembles can be helpful sometimes.

pengzhangzhi commented 1 year ago

Hi. Another question: do you evaluate RMSD on all atoms or only on the backbone? Some antibody structure prediction methods evaluate only the backbone.
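
To make the distinction concrete, here is a minimal sketch of how the two numbers can differ. It is not OmegaFold's evaluation code. It assumes the two structures are already superposed and that atoms are matched by a hypothetical `(residue_id, atom_name)` key. Treating N, CA, C and O as the backbone is also an assumption, since some methods leave out O.

```python
import math

# Assumed backbone definition; some papers use only N, CA, C.
BACKBONE = {"N", "CA", "C", "O"}

def rmsd(atoms_a, atoms_b, backbone_only=False):
    """RMSD over atoms present in both structures.

    atoms_a / atoms_b: dict mapping (residue_id, atom_name) -> (x, y, z).
    Assumes the coordinates are already superposed (no alignment is done here).
    """
    keys = [
        k for k in atoms_a
        if k in atoms_b and (not backbone_only or k[1] in BACKBONE)
    ]
    if not keys:
        raise ValueError("no matching atoms to compare")
    squared = sum(
        sum((p - q) ** 2 for p, q in zip(atoms_a[k], atoms_b[k]))
        for k in keys
    )
    return math.sqrt(squared / len(keys))

# Toy coordinates (made up): only the side-chain CB atom moves.
native = {(1, "N"): (0.0, 0.0, 0.0), (1, "CA"): (1.0, 0.0, 0.0), (1, "CB"): (1.0, 1.0, 0.0)}
predicted = {(1, "N"): (0.0, 0.0, 0.0), (1, "CA"): (1.0, 0.0, 0.0), (1, "CB"): (1.0, 1.0, 3.0)}

backbone_rmsd = rmsd(native, predicted, backbone_only=True)
all_atom_rmsd = rmsd(native, predicted)
```

In this toy case the backbone RMSD is zero while the all-atom RMSD is not, so the same prediction can look very different depending on which atoms are counted.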

ijayden-lung commented 1 year ago

You can now predict antibody complexes by adding a long linker such as 'GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG' between the different chains.
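
A minimal sketch of building such a linked input, assuming the model takes a single-sequence FASTA file. The helper names, the 30-residue linker length and the short sequence fragments are all placeholders for illustration. They are not the real 7KPJ chains and not part of the OmegaFold API.

```python
# Assumed linker: a run of glycines. Adjust the length to match the linker you use.
LINKER = "G" * 30

def join_chains(chains, linker=LINKER):
    """Concatenate chain sequences with the linker between each pair."""
    return linker.join(chains)

def to_fasta(name, sequence, width=80):
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [f">{name}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"

# Placeholder fragments, NOT the real heavy/light chain sequences.
heavy = "EVQLVESGGG"
light = "DIQMTQSPSS"

linked = join_chains([heavy, light])
record = to_fasta("heavy_linker_light", linked)
```

The resulting record can be written to a file and passed to the predictor like any single-chain input.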

pengzhangzhi commented 1 year ago

GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

So you can tell where each chain begins and ends by using the linker as a marker, right?

ijayden-lung commented 1 year ago

Sure.
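
One way to recover the chain boundaries afterwards is sketched below. It assumes you know each chain's length and the linker length. `chain_ranges` is a hypothetical helper, not part of OmegaFold.

```python
def chain_ranges(chain_lengths, linker_length):
    """Return (start, end) index ranges (end exclusive) of each chain
    inside a sequence built as chain + linker + chain + ..."""
    ranges = []
    start = 0
    for length in chain_lengths:
        ranges.append((start, start + length))
        start += length + linker_length  # skip over the linker
    return ranges

# Placeholder fragments joined with an assumed 30-glycine linker.
heavy = "EVQLVESGGG"
light = "DIQMTQSPSS"
linker = "G" * 30
linked = heavy + linker + light

ranges = chain_ranges([len(heavy), len(light)], len(linker))
chains = [linked[start:end] for start, end in ranges]
```

Working from known lengths avoids ambiguity when a chain itself starts or ends with glycine, as the placeholder heavy fragment does. Searching for the linker string alone could then put the boundary in the wrong place.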