facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

ESMFold fails for some long sequences? #641

Open yuanqm55 opened 9 months ago

yuanqm55 commented 9 months ago

Bug description

When I try to fold the sequence of Q49HI2 (seq.txt) on an A800 GPU (80 GB) by running:

    python esmfold.py -i ./Q49HI2.fa -o ./ --cpu-offload

the log is:

    23/12/11 22:57:36 | INFO | root | Predicted structure for Q49HI2 with length 1058, pLDDT 88.0, pTM 0.999 in 139.9s. 1 / 1 completed.

Note that the pTM is 0.999, which is clearly unreasonable, and the output structure also looks very wrong:

[image]

The PDB file is attached here (I changed the suffix to .txt so that it could be uploaded): Q49HI2.txt

Interestingly, when I set the --chunk-size parameter, everything goes back to normal:

    python esmfold.py -i ./Q49HI2.fa -o ./ --cpu-offload --chunk-size 128

The log is:

    23/12/11 22:49:10 | INFO | root | Predicted structure for Q49HI2 with length 1058, pLDDT 65.8, pTM 0.605 in 143.8s. 1 / 1 completed.

[image]
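To catch this kind of silent failure in a batch run, one option is to scan the log for summary lines and flag implausible scores. The following is only a minimal sketch: the regular expression is inferred from the two log lines quoted in this report, and the helper names (`parse_log_line`, `looks_suspicious`) and the `ptm_ceiling` threshold of 0.99 are hypothetical choices, not part of the esm package.

```python
import re

# Pattern inferred from the summary lines quoted in this report (an assumption,
# not a documented log format).
LOG_RE = re.compile(
    r"Predicted structure for (?P<name>\S+) with length (?P<length>\d+), "
    r"pLDDT (?P<plddt>[\d.]+), pTM (?P<ptm>[\d.]+) in (?P<seconds>[\d.]+)s"
)


def parse_log_line(line):
    """Return the fields of a summary line as a dict, or None if it does not match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {
        "name": m["name"],
        "length": int(m["length"]),
        "plddt": float(m["plddt"]),
        "ptm": float(m["ptm"]),
        "seconds": float(m["seconds"]),
    }


def looks_suspicious(record, ptm_ceiling=0.99):
    """Flag records whose pTM is at or above a hypothetical plausibility ceiling."""
    return record["ptm"] >= ptm_ceiling


# The two log lines from this report: without and with --chunk-size 128.
no_chunk = parse_log_line(
    "23/12/11 22:57:36 | INFO | root | Predicted structure for Q49HI2 "
    "with length 1058, pLDDT 88.0, pTM 0.999 in 139.9s. 1 / 1 completed."
)
chunked = parse_log_line(
    "23/12/11 22:49:10 | INFO | root | Predicted structure for Q49HI2 "
    "with length 1058, pLDDT 65.8, pTM 0.605 in 143.8s. 1 / 1 completed."
)

for label, record in (("no chunking", no_chunk), ("chunk size 128", chunked)):
    status = "SUSPICIOUS" if looks_suspicious(record) else "ok"
    print(f"{label}: pTM={record['ptm']} pLDDT={record['plddt']} -> {status}")
```

With the two lines above, the run without chunking is flagged and the run with `--chunk-size 128` is not. The threshold is a heuristic and would need tuning; a high pTM is not wrong in itself, so a flagged structure should still be inspected by eye.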