patrickbryant1 / Umol

Protein-ligand structure prediction

Training Protocol #3

hypefolder closed 11 months ago

hypefolder commented 11 months ago

Hello Patrick,

Thank you for sharing your amazing work!

Forgive me if I missed something, but I could not find the details of your training regimen in your article. I am particularly interested in how training was performed on large proteins and ligands (>500 tokens). On page 10 you mention that "15 complexes are out of memory" and that you "crop these to 500 residues". Did you do the same for training? Did you randomly crop proteins as in AF2, and if so, what sequence size did you choose?

Thank you for your help in advance.

patrickbryant1 commented 11 months ago

Hi,

It is true that not all details are in the preprint, but they will be in the published version of the paper. For training, we cropped the protein around the binding site to 256 - ligand_size. The complexes that ran out of memory at inference could also have been cropped to a larger size (the limit is approximately 1000 in total size).

Hope this helps!
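
For readers who want to try the cropping described above, here is a minimal sketch. It is not the Umol training code. It assumes that "around the binding site" means keeping the residues whose C-alpha atoms lie closest to any ligand atom, and that the budget counts protein residues plus ligand atoms; the thread does not confirm either. The function name `crop_around_binding_site` and all of its parameters are hypothetical.

```python
import numpy as np


def crop_around_binding_site(ca_coords, ligand_coords, total_size=256):
    """Return sorted indices of the protein residues to keep.

    Assumption (not confirmed by the authors): the crop keeps the
    (total_size - n_ligand_atoms) residues whose C-alpha atoms are
    nearest to any ligand atom.
    """
    ca_coords = np.asarray(ca_coords, dtype=float)          # (n_res, 3)
    ligand_coords = np.asarray(ligand_coords, dtype=float)  # (n_lig, 3)

    # Residue budget left once the ligand atoms are counted.
    n_keep = total_size - len(ligand_coords)
    if n_keep <= 0:
        raise ValueError("ligand alone fills or exceeds the crop budget")

    # Small proteins fit without cropping.
    if len(ca_coords) <= n_keep:
        return np.arange(len(ca_coords))

    # Distance from every residue to every ligand atom: (n_res, n_lig).
    diffs = ca_coords[:, None, :] - ligand_coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # Distance from each residue to its nearest ligand atom.
    min_dist = dists.min(axis=1)

    # Keep the n_keep nearest residues, restored to sequence order.
    keep = np.argsort(min_dist, kind="stable")[:n_keep]
    return np.sort(keep)
```

With a 10-atom ligand and the default `total_size=256`, the function keeps at most 246 residues. Passing a larger `total_size` (for example, a value near the approximate 1000 limit mentioned for inference) gives a correspondingly larger crop under the same assumptions.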