polo9719 closed this 2 months ago
I've been experiencing a similar problem. I believe the truncation length is limited to 1022 because the protein is embedded using ESM-2. According to https://github.com/facebookresearch/esm/issues/628, increasing the truncation length should not cause any issues, except for requiring more memory.
@prathithbhargav is correct. With the current ESM implementation, we are limited to 1022. @polo9719, if you increase the length, it will just get truncated downstream, at least for ESM embedding purposes. I wouldn't trust those predictions.
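For anyone who wants to catch this before running inference, here is a minimal sketch of a pre-flight check on chain lengths. It is not DiffDock or ESM code: `ESM_MAX_RESIDUES`, `find_overlong_chains`, and `truncate_chain` are hypothetical names, the input is assumed to be a plain dict of chain ID to one-letter sequence, and truncating by keeping the first 1022 residues is an assumption made only for illustration.

```python
# Hypothetical helper, not part of DiffDock or ESM.
# 1022 is the per-chain limit discussed in this thread.
ESM_MAX_RESIDUES = 1022


def find_overlong_chains(chains, max_len=ESM_MAX_RESIDUES):
    """Return {chain_id: length} for every chain longer than max_len."""
    return {
        chain_id: len(seq)
        for chain_id, seq in chains.items()
        if len(seq) > max_len
    }


def truncate_chain(seq, max_len=ESM_MAX_RESIDUES):
    """Keep only the first max_len residues (assumed truncation strategy)."""
    return seq[:max_len]


if __name__ == "__main__":
    # Toy sequences: chain A fits, chain B exceeds the limit.
    chains = {"A": "M" * 300, "B": "G" * 1500}
    for chain_id, length in find_overlong_chains(chains).items():
        print(f"chain {chain_id}: {length} residues exceeds {ESM_MAX_RESIDUES}")
```

A check like this only flags the problem. It does not make predictions for the truncated region trustworthy.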
Hi, I realized that DiffDock fails to run inference on complexes where the protein contains a chain with more than 1022 residues.
This limit is hard-coded here: https://github.com/gcorso/DiffDock/blob/6f5d4b152b48fc1bf2ab3e3e51cd17f29826e3c4/utils/inference_utils.py#L69
Manually increasing it to 2048 seems to fix my issue, but I was wondering whether this could cause bad predictions. What are your thoughts?
Thank you in advance, Paul