ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Error in conversion #774

Closed luigi-bar closed 1 year ago

luigi-bar commented 1 year ago

Current Behavior

While converting the 7B model I got the error:

{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': -1}
Traceback (most recent call last):
  File "/content/llama.cpp/llama.cpp/llama.cpp/llama.cpp/convert-pth-to-ggml.py", line 274, in <module>
    main()
  File "/content/llama.cpp/llama.cpp/llama.cpp/llama.cpp/convert-pth-to-ggml.py", line 239, in main
    hparams, tokenizer = load_hparams_and_tokenizer(dir_model)
  File "/content/llama.cpp/llama.cpp/llama.cpp/llama.cpp/convert-pth-to-ggml.py", line 105, in load_hparams_and_tokenizer
    tokenizer = SentencePieceProcessor(fname_tokenizer)
  File "/usr/local/lib/python3.9/dist-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/usr/local/lib/python3.9/dist-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/usr/local/lib/python3.9/dist-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())] 
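The `ParseFromArray` failure means SentencePiece could not parse `tokenizer.model` as its protobuf format, which usually points to a wrong or corrupt file rather than a bug in the converter. A common culprit (an assumption here, not confirmed in this thread) is an incomplete download, e.g. a git-lfs pointer file or an HTML error page saved under the model's name. A minimal stdlib-only sanity check, with a hypothetical helper name:

```python
def looks_like_bad_download(path):
    """Heuristic check for files that will make SentencePiece's
    ParseFromArray fail: git-lfs pointers, HTML error pages, empty files.
    Returns a description of the problem, or None if nothing obvious."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(b"version https://git-lfs.github.com"):
        return "git-lfs pointer file, not the real model"
    if head.lstrip().startswith((b"<!DOCTYPE", b"<html")):
        return "HTML page, not a model file"
    if len(head) == 0:
        return "empty file"
    return None  # no obvious problem; the file could still be truncated
```

If this returns a reason, re-download the file (e.g. `git lfs pull` for an LFS pointer) before retrying the conversion.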

Info about my environment is below - let me know if you have any hints, thanks! Luigi

Environment and Context

The environment is Google Colab. The weights have been verified via md5sum:

# as per: https://github.com/ggerganov/llama.cpp/issues/238
md5sum ./models/*/*.pth | sort -k 2,2
6efc8dab194ab59e49cd24be5574d85e  ./models/7B/consolidated.00.pth
$ python3 --version
Python 3.9.16

$ make --version
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Environment info:


$ git log | head -1
commit 58c438cf7dfbbef710b1905a453a38a8a9ced07d

$ pip list | egrep "torch|numpy|sentencepiece"
numpy                         1.22.4
sentencepiece                 0.1.97
torch                         2.0.0+cu118
torchaudio                    2.0.1+cu118
torchdata                     0.6.0
torchsummary                  1.5.1
torchtext                     0.15.1
torchvision                   0.15.1+cu118

sportshead commented 1 year ago

Had the same problem. It turns out I was using the wrong tokenizer.model. Make sure your sha256sum matches the one found in the SHA256SUMS file:

$ sha256sum tokenizer.model
9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347  tokenizer.model
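For anyone verifying from Python instead of the shell, the same check can be done with the standard library's `hashlib`. This is just a convenience sketch; `sha256_of` is a hypothetical helper, and the path in the comment is illustrative:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks, so multi-GB
    model files never need to fit in memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the result against the matching entry in the repo's SHA256SUMS file, e.g.:
# sha256_of("models/tokenizer.model") == "9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347"
```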

I found one that works on Hugging Face at chavinlo/gpt4-x-alpaca.

prusnak commented 1 year ago

Reopen if the hash is the same and the issue still persists.

apepkuss commented 9 months ago

Same issue with cyberagent/calm2-7b-chat and WizardLM/WizardCoder-33B-V1.1.