DeepGraphLearning / ProtST

[ICML-23 ORAL] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Apache License 2.0

Fail to load EC dataset with error: zlib.error: Error -3 while decompressing data: invalid stored block lengths #12

Open AndyCao1125 opened 8 months ago

AndyCao1125 commented 8 months ago

Hi! I'm very interested in your work. However, I always run into problems when running downstream tasks. I try to run the code with

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 ./script/run_downstream.py --config ./config/downstream_task/PretrainESM/annotation_tune.yaml --checkpoint ./pretrained_weights/protst_esm1b.pth --dataset td_datasets.EnzymeCommission --branch null

However, it always fails with the error:

zlib.error: Error -3 while decompressing data: invalid stored block lengths

and loading stops at 92% every time: [screenshot]

I have deleted the extracted EC files and re-extracted the original zip file many times (which takes a long time), but loading still fails at 92% with the same error :( Could you please help me check this? Thanks.
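
Since re-extracting does not change the outcome, one thing worth ruling out is that the downloaded archive itself is damaged. Below is a minimal sketch of such a check, assuming the dataset is distributed as an ordinary zip file. The function name `first_corrupt_member` is hypothetical and not part of ProtST or TorchDrug; it only uses Python's standard `zipfile` and `zlib` modules.

```python
import zipfile
import zlib


def first_corrupt_member(zip_path, chunk_size=1 << 20):
    """Return the name of the first archive member that cannot be read
    cleanly (CRC mismatch or broken compressed stream), or None if every
    member decompresses without error.

    Hypothetical helper for illustration; not part of ProtST or TorchDrug.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            try:
                with archive.open(info) as member:
                    # Read to the end so the CRC-32 check is triggered.
                    while member.read(chunk_size):
                        pass
            except (zipfile.BadZipFile, zlib.error):
                return info.filename
    return None
```

Calling `first_corrupt_member(...)` on the downloaded zip (wherever it was saved locally) returns `None` for an intact archive. If it returns a file name, the download is probably incomplete or corrupted, and downloading the archive again is more likely to help than extracting it again.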