paperswithcode / galai

Model API for GALACTICA
Apache License 2.0

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory #15

Closed sebasmos closed 1 year ago

sebasmos commented 1 year ago

Hi, I installed and imported galai successfully on Ubuntu 21.01. I installed galai as follows:

conda create -n galia python=3.9
conda activate galia
pip install git+https://github.com/paperswithcode/galai

and the code I tested is:

import galai as gal
model = gal.load_model(name = 'mini', num_gpus = 1)
model.generate("Lecture 1: The Ising Model\n\n", new_doc=True, top_p=0.7, max_length=200)

However, running it produces the following error:

Traceback (most recent call last):
  File "mini.py", line 3, in <module>
    model = gal.load_model("standard")
  File "/home/sebasmos/Desktop/AnpassenNN//galia/galai/galai/__init__.py", line 41, in load_model
    model._load_checkpoint(checkpoint_path=get_checkpoint_path(name))
  File "/home/sebasmos/Desktop/AnpassenNN//galia/galai/galai/model.py", line 69, in _load_checkpoint
    offload_state_dict=True
  File "/home/sebasmos/anaconda3/envs/yolo/lib/python3.7/site-packages/accelerate/big_modeling.py", line 372, in load_checkpoint_and_dispatch
    offload_state_dict=offload_state_dict,
  File "/home/sebasmos/anaconda3/envs/yolo/lib/python3.7/site-packages/accelerate/utils/modeling.py", line 679, in load_checkpoint_in_model
    checkpoint = torch.load(checkpoint_file)
  File "/home/sebasmos/anaconda3/envs/yolo/lib/python3.7/site-packages/torch/serialization.py", line 705, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/home/sebasmos/anaconda3/envs/yolo/lib/python3.7/site-packages/torch/serialization.py", line 243, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

dionator commented 1 year ago

Same problem here. In my case the error occurs on: model = gal.load_model("standard")

lsiksous commented 1 year ago

You should check under ~/.cache/galactica/standard.pt/. The downloaded zipped checkpoint is probably corrupt. Just delete it and try again.
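As a rough way to confirm this before deleting anything, the sketch below flags checkpoint files that are not readable zip archives. A truncated download loses the zip central directory at the end of the file, which is what the error message complains about. The function name, the suffix list, and the cache layout are assumptions for illustration, not part of galai, and the check only makes sense for checkpoints saved in PyTorch's zip-based format.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def find_unreadable_checkpoints(
    cache_dir: Path,
    suffixes: Iterable[str] = (".pt", ".bin"),  # assumed checkpoint suffixes
) -> List[Path]:
    """Return checkpoint files under cache_dir that are not valid zip archives.

    zipfile.is_zipfile() looks for the end-of-central-directory record,
    which a truncated or interrupted download will be missing.
    """
    wanted = set(suffixes)
    unreadable = []
    for path in sorted(cache_dir.rglob("*")):
        if path.is_file() and path.suffix in wanted:
            if not zipfile.is_zipfile(path):
                unreadable.append(path)
    return unreadable
```

Calling `find_unreadable_checkpoints(Path.home() / ".cache" / "galactica")` would list the suspect files (the path is taken from the comment above and may differ on your machine). Deleting them with `path.unlink()` should force a fresh download on the next `load_model` call.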

dionator commented 1 year ago

@lsiksous thanks for the hint. Any idea where the cache folder would be on Windows?

dionator commented 1 year ago

@lsiksous, you're onto something. I forced a download of a different model size (changed "standard" to "mini") and it works. Running the code with all the other model sizes works as well. This does seem to suggest that the originally downloaded model is the problem ("standard" in my case, since I just copied the sample code from the repo's README).

mkardas commented 1 year ago

Hi all, in galai 1.1.0 we switched to transformers for checkpoint management. See https://huggingface.co/docs/transformers/installation#cache-setup for where the cache is located and how to change it. Closing this for now, as it seems to be caused by file corruption, as mentioned above. Please reopen if you still have any issues.
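For anyone trying to find that cache on their own machine (including Windows), here is a minimal sketch of the lookup order as I understand it from the linked page: an explicit cache variable first, then HF_HOME, then XDG_CACHE_HOME, then the default under the home directory. The helper name `guess_hf_cache_dir` is made up for illustration, and the exact variable names and precedence may differ between transformers versions, so treat the result as a hint rather than the library's actual behaviour.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def guess_hf_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Approximate where transformers caches downloaded checkpoints.

    This is a simplified reading of the documented cache setup,
    not a call into the library itself.
    """
    env = os.environ if env is None else env

    # 1. Explicit cache directory variables take priority.
    for var in ("HF_HUB_CACHE", "HUGGINGFACE_HUB_CACHE", "TRANSFORMERS_CACHE"):
        if env.get(var):
            return Path(env[var]).expanduser()

    # 2. HF_HOME points at the Hugging Face root; the hub cache sits below it.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"

    # 3. XDG_CACHE_HOME replaces ~/.cache when set.
    if env.get("XDG_CACHE_HOME"):
        return Path(env["XDG_CACHE_HOME"]).expanduser() / "huggingface" / "hub"

    # 4. Default location under the user's home directory.
    return Path.home() / ".cache" / "huggingface" / "hub"
```

`print(guess_hf_cache_dir())` shows the directory to inspect. On Windows, `Path.home()` resolves to the user profile folder, so the default ends up under that folder's `.cache` directory.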