aqlaboratory / openfold

Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
Apache License 2.0

Multi-chain support? #358

Open mheinzinger opened 1 year ago

mheinzinger commented 1 year ago

Hi, first of all: absolutely great work - thank you so much for putting so much time and effort into sharing the training code for SOTA structure prediction! :)

From my current understanding, all training is done on single chains. However, I would be really curious to try fine-tuning the existing OpenFold model for protein-protein interactions, which would require inputting two or more chains.

- Did I potentially miss this feature?
- If not: is there any plan on your end to include this functionality?
- If not: could you give some pointers on what would need to be changed to get this working?

Thanks again, all the best, Michael

vaclavhanzl commented 1 year ago

Hi Michael, maybe you want to explore the multimer branch.

mheinzinger commented 1 year ago

Thanks for the pointer - that is definitely extremely helpful!

I was just wondering whether there is a README or brief summary of the technical details, or of the bells and whistles that are supported for multimer-specific training. From the OpenFold paper, I only found that they use the "AlphaFold-Gap implementation" and that "it falls short of the accuracy of AlphaFold-Multimer", but without any comparison in terms of numbers (from Fig. 1A of the AF-Multimer paper, I gather that this trick falls severely short in terms of performance). From this line, I also gather that this is the current way of handling multiple chains (but still a great starting point, of course!).
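For readers unfamiliar with the gap trick mentioned above, here is a minimal sketch of the general idea: the chains are concatenated into one pseudo-chain, and the residue index is shifted between chains so that the model's relative positional encoding treats them as far apart. This is only an illustration, not OpenFold's actual code; the function name `gap_residue_index` and the gap size of 200 are assumptions chosen for the example.

```python
def gap_residue_index(chain_lengths, gap=200):
    """Build a residue index for several chains concatenated into one.

    Hypothetical helper for illustration only. `gap` is an assumed value:
    the number of index positions skipped between consecutive chains.
    """
    residue_index = []
    offset = 0
    for length in chain_lengths:
        # Residues within a chain keep consecutive indices.
        residue_index.extend(range(offset, offset + length))
        # The next chain starts `gap` positions after this chain ends.
        offset += length + gap
    return residue_index


# Two toy chains of length 3 and 2, fed to the model as one sequence.
sequences = ["MKV", "AG"]
concatenated = "".join(sequences)
residue_index = gap_residue_index([len(s) for s in sequences], gap=200)
print(concatenated)   # MKVAG
print(residue_index)  # [0, 1, 2, 203, 204]
```

With this approach the model receives no explicit chain identity; the jump in the residue index is the only signal that a chain break occurred, which is one plausible reason the trick lags behind a model trained on complexes.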