dptech-corp / Uni-Fold

An open-source platform for developing protein models beyond AlphaFold.
https://doi.org/10.1101/2022.08.04.502811
Apache License 2.0

UniFold crash: unable to find SCOPdata (a bug that has popped up in ColabFold, & there is a straightforward reason and patch) #141

Closed jfbazan closed 10 months ago

jfbazan commented 11 months ago

I find the UniFold Colab to be a uniquely useful resource that is both fast and accurate in building complexes, but it has just run into a bug that crashes it at the "Generate MSAs.." stage with the error message below. The cause appears to be a biopython update that removed the Bio.Data.SCOPData module that UniFold imports, and a possible fix is to "use PDBData instead", as the biopython deprecation notice says:

Bio.Data.SCOPData Declared obsolete in release 1.80, and removed in release 1.82. Please use Bio.Data.PDBData instead.

First, here's the error message in UniFold:

fused_multi_tensor is not installed corrected
fused_rounding is not installed corrected
fused_layer_norm is not installed corrected
fused_softmax is not installed corrected

ImportError                               Traceback (most recent call last)
in <cell line: 14>()
     12 import gzip
     13 from unifold.msa import parsers
---> 14 from unifold.msa import pipeline
     15 from unifold.data.utils import compress_features
     16 from unifold.data.protein import PDB_CHAIN_IDS

2 frames
/usr/local/lib/python3.10/dist-packages/unifold/msa/mmcif.py in
     23 from Bio import PDB
     24 from Bio.PDB.MMCIFParser import MMCIFParser
---> 25 from Bio.Data import SCOPData
     26
     27 # Type aliases:

ImportError: cannot import name 'SCOPData' from 'Bio.Data' (/usr/local/lib/python3.10/dist-packages/Bio/Data/__init__.py)
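For illustration, the failing import in unifold/msa/mmcif.py could be guarded so that it works both before and after the removal of SCOPData. This is a minimal sketch, not the patch that Uni-Fold or ColabFold actually shipped. The helper name `load_three_to_one_table` is hypothetical, and the attribute names `protein_letters_3to1_extended` and `protein_letters_3to1` are assumptions that should be checked against the installed biopython release.

```python
import importlib

# Candidate modules and attribute names, newest first. The attribute names are
# assumptions to verify against the installed Biopython release.
_CANDIDATES = (
    ("Bio.Data.PDBData", ("protein_letters_3to1_extended", "protein_letters_3to1")),
    ("Bio.Data.SCOPData", ("protein_letters_3to1",)),
)


def load_three_to_one_table():
    """Return (source, table) for 3-letter to 1-letter residue codes.

    source is None and table is empty when no candidate can be imported,
    for example when Biopython is not installed at all.
    """
    for module_name, attr_names in _CANDIDATES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module absent in this release, or Biopython missing
        for attr_name in attr_names:
            table = getattr(module, attr_name, None)
            if table is not None:
                return f"{module_name}.{attr_name}", table
    return None, {}


source, three_to_one = load_three_to_one_table()
print("residue table loaded from:", source)
```

Code that previously looked residues up through SCOPData would then read from `three_to_one` instead, whichever module supplied it.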

Here's the kicker: this problem has also affected ColabFold, see https://github.com/sokrypton/ColabFold/issues/540, and you can see that the error message is essentially the same.

The indefatigable Milot Mirdita pounced on the problem and immediately had a patch that just downgrades biopython (until a more permanent solution, e.g. pointing to PDBData instead, is implemented):

"We are fixing the issue by downgrading biopython. Will deploy a real fix soon afterwards!"

Thank you very much in advance for fixing the terrific UniFold Colab!

leiloull commented 11 months ago

I ran into a similar issue, with this error message: ModuleNotFoundError: No module named 'SCOPData'. I have been trying to downgrade my local biopython from version 1.82, but so far it hasn't worked.
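When a downgrade does not seem to take effect, it can help to check which biopython version the running interpreter actually sees. The sketch below is only a diagnostic under stated assumptions: the function names are hypothetical, and the 1.82 cutoff is taken from the deprecation notice quoted earlier in this thread.

```python
import re
from importlib import metadata


def scopdata_available(version: str) -> bool:
    """True if this Biopython version predates 1.82, where SCOPData was removed."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (1, 82)


def installed_biopython_version():
    """Return the installed biopython version string, or None if it is absent."""
    try:
        return metadata.version("biopython")
    except metadata.PackageNotFoundError:
        return None


version = installed_biopython_version()
if version is None:
    print("biopython is not installed in this environment")
else:
    print(f"biopython {version}: SCOPData importable = {scopdata_available(version)}")
```

If this still reports 1.82 or later after a downgrade, one possible cause is that the package was installed into a different environment than the one running the code. In a notebook, the runtime may also need a restart before a reinstalled package is picked up.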

jfbazan commented 11 months ago

I appreciate the additional confirmation (from AlvinLou) that this 'SCOPData' bug is causing UniFold to crash. To perhaps inspire a fix, possibly modeled after Milot Mirdita's patch for ColabFold, I wrote an email to zhanglf@dp.tech, the lead author on the UF publication, alerting him to the problem and including the link to this open issue. I hope this brings some action...