turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License

RuntimeError with airoboros-l2-13b #233

Closed · corv89 closed 1 year ago

corv89 commented 1 year ago

Specifically, I'm encountering:


```
  File "/home/username/Downloads/exllama/webui/app.py", line 147, in <module>
    tokenizer = ExLlamaTokenizer(args.tokenizer)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/Downloads/exllama/tokenizer.py", line 10, in __init__
    self.tokenizer = SentencePieceProcessor(model_file = self.path)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/Downloads/exllama/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/home/username/Downloads/exllama/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/Downloads/exllama/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
```

The model is from `https://huggingface.co/TheBloke/airoboros-l2-13b-gpt4-2.0-GPTQ`.

turboderp commented 1 year ago

This sounds like the same issue as #176. The problem in that case was a corrupted `tokenizer.model` file.
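Not part of the original thread, but a quick way to sanity-check a downloaded `tokenizer.model` is to confirm it isn't a Git LFS pointer stub, a common way Hugging Face downloads end up "corrupted" (the real binary never gets fetched and a tiny text placeholder is left behind). A minimal sketch; the `looks_like_lfs_pointer` helper and the example path are hypothetical:

```python
# Hedged sketch (not exllama code): a frequent cause of the
# SentencePiece ParseFromArray error is a Git LFS pointer text file
# sitting where the real binary tokenizer.model should be.
from pathlib import Path

def looks_like_lfs_pointer(path: str) -> bool:
    """A real tokenizer.model is a binary protobuf (hundreds of KB);
    an LFS pointer is a tiny text file starting with this signature."""
    head = Path(path).read_bytes()[:40]
    return head.startswith(b"version https://git-lfs.github.com")

# Hypothetical usage:
# if looks_like_lfs_pointer("models/airoboros/tokenizer.model"):
#     print("Pointer stub detected; re-download with git lfs pull")
```

If the check fires, re-downloading the file (e.g. via `git lfs pull` or the Hugging Face web UI) replaces the stub with the real model.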

corv89 commented 1 year ago

You are exactly right: I had two broken `tokenizer.model` files.

Thanks for clearing this up!