bigcode-project / starcoder.cpp

C++ implementation for 💫StarCoder

python3 convert-hf-to-ggml.py bigcode/starcoderplus is failing with error: RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory #29

Open antonakv opened 1 year ago

antonakv commented 1 year ago

python3 convert-hf-to-ggml.py bigcode/starcoderplus

fails with the error: RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

% python3 convert-hf-to-ggml.py models/starcoderplus
Loading model:  models/starcoderplus
Loading checkpoint shards:  29%|█████████████████████████████████████▍                                                                                             | 2/7 [00:45<01:54, 22.80s/it]
Traceback (most recent call last):
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 460, in load_state_dict
    return torch.load(checkpoint_file, map_location="cpu")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/torch/serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/torch/serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 464, in load_state_dict
    if f.read(7) == "version":
       ^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 128: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/starcoder.cpp/convert-hf-to-ggml.py", line 58, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_name, config=config, torch_dtype=torch.float16 if use_f16 else torch.float32, low_cpu_mem_usage=True, trust_remote_code=True, offload_state_dict=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3246, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/starcoder.cpp/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 476, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models/starcoderplus/pytorch_model-00003-of-00007.bin' at 'models/starcoderplus/pytorch_model-00003-of-00007.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
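
A "failed finding central directory" error usually means the file that torch.load opened is not a complete zip archive. The zip format keeps its central directory at the end of the file, so a truncated or partially downloaded shard typically fails in exactly this way. The thread does not confirm that this is the cause here, so the following is only a minimal sketch for checking it. It assumes the shards sit in a local directory and follow the pytorch_model-*.bin naming seen in the traceback. The helper name find_unreadable_shards is hypothetical and is not part of convert-hf-to-ggml.py or transformers.

```python
import sys
import zipfile
from pathlib import Path


def find_unreadable_shards(model_dir):
    """Return the names of pytorch_model-*.bin shards that are not valid zip archives.

    Hypothetical helper: it assumes the checkpoint was saved in PyTorch's
    zip-based format, which is what the traceback above is trying to read.
    """
    bad = []
    for shard in sorted(Path(model_dir).glob("pytorch_model-*.bin")):
        # is_zipfile looks for the end-of-central-directory record near the
        # end of the file, which a truncated download will be missing.
        if not zipfile.is_zipfile(shard):
            bad.append(shard.name)
    return bad


if __name__ == "__main__":
    # Default path taken from the command in this issue; pass another as argv[1].
    model_dir = sys.argv[1] if len(sys.argv) > 1 else "models/starcoderplus"
    for name in find_unreadable_shards(model_dir):
        print(f"unreadable shard: {name}")
```

If a shard such as pytorch_model-00003-of-00007.bin is reported, comparing its size with the one listed on the model page and downloading it again would be the next thing to try.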

Installed python packages:

% pip3 list
Package            Version
------------------ ---------
accelerate         0.21.0
certifi            2023.7.22
charset-normalizer 3.2.0
filelock           3.12.2
fsspec             2023.6.0
huggingface        0.0.1
huggingface-hub    0.16.4
idna               3.4
Jinja2             3.1.2
MarkupSafe         2.1.3
mpmath             1.3.0
networkx           3.1
numpy              1.25.1
packaging          23.1
pip                23.0.1
psutil             5.9.5
PyYAML             6.0.1
regex              2023.6.3
requests           2.31.0
safetensors        0.3.1
setuptools         67.6.1
starcoder          0.0.2
sympy              1.12
tokenizers         0.13.3
torch              2.0.1
tqdm               4.65.0
transformers       4.31.0
typing_extensions  4.7.1
urllib3            2.0.4

MexHigh commented 8 months ago

Same problem here