sokrypton / ColabFold

Making Protein folding accessible to all!
MIT License

RosettaFold2 Colab crashing at end of prediction run #603

Open jfbazan opened 5 months ago

jfbazan commented 5 months ago

At the end of a RosettaFold2 Colab dimer prediction run (with default variables) that has completed 6 recycles, where I tried both the V100 GPU and then the T4 GPU settings (both with the High-RAM option), the program crashes with the error message below.

The run fails with `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)`.

```
jobname: XXX lengths: [555, 555]
getting unpaired MSA
COMPLETE: 100%|██████████| 150/150 [elapsed: 00:00 remaining: 00:00]
N=2103 L=1110
recycle 0 plddt 0.563 pae 25.219 rmsd 37.300
recycle 1 plddt 0.571 pae 24.891 rmsd 11.995
recycle 2 plddt 0.567 pae 24.844 rmsd 5.358
recycle 3 plddt 0.567 pae 24.938 rmsd 2.560
recycle 4 plddt 0.567 pae 24.922 rmsd 2.661
recycle 5 plddt 0.561 pae 24.969 rmsd 1.578
recycle 6 plddt 0.562 pae 24.969 rmsd 2.129
```

```
RuntimeError                              Traceback (most recent call last)
in <cell line: 87>()
     90 np.random.seed(seed)
     91 npz = f"{jobname}/rf2_seed{seed}_00.npz"
---> 92 pred.predict(inputs=[f"{jobname}/msa.a3m"],
     93              out_prefix=f"{jobname}/rf2_seed{seed}",
     94              symm=symm,

2 frames
/usr/local/lib/python3.10/dist-packages/torch/functional.py in einsum(*args)
    378     # the path for contracting 0 or 1 time(s) is already optimized
    379     # or the user has disabled using opt_einsum
--> 380     return _VF.einsum(equation, operands)  # type: ignore[attr-defined]
    381
    382     path = None

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
```
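For context, this class of error occurs whenever a CPU tensor and a CUDA tensor feed the same op, here `torch.einsum`, which dispatches to batched matrix multiply (`bmm`). The sketch below is a generic illustration of the failure and the usual fix (moving all operands to one device), not the actual ColabFold patch; the tensor names and shapes are invented for the example.

```python
import torch

# Two CPU tensors shaped for a batched matmul (the op behind einsum/bmm).
a = torch.randn(2, 3, 4)
b = torch.randn(2, 4, 5)

if torch.cuda.is_available():
    a = a.cuda()  # moving only ONE operand to cuda:0 reproduces the crash:
    try:
        torch.einsum("bij,bjk->bik", a, b)
    except RuntimeError as e:
        print(e)  # "Expected all tensors to be on the same device, ... cpu and cuda:0!"

# Generic fix: pick one device and move every operand onto it before the call.
device = "cuda" if torch.cuda.is_available() else "cpu"
a, b = a.to(device), b.to(device)
out = torch.einsum("bij,bjk->bik", a, b)  # both operands now share a device
print(out.shape)  # torch.Size([2, 3, 5])
```

In model code the same bug typically hides in a buffer or weight that was created after `model.to(device)` was called, so it stays on the CPU while the inputs are on the GPU.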

sokrypton commented 5 months ago

Please try again! The error should be fixed now.

jfbazan commented 5 months ago

Indeed, back to life! Many thanks, Sergey