turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Piece ID is out of range. #422

Closed Ph0rk0z closed 2 months ago

Ph0rk0z commented 2 months ago

On the latest dev branch, when I load command-r+ I get a "piece id is out of range" error and the model doesn't load. The same problem occurs with the exllama loader in textgen and with the tabby API; the exllama HF loader works.

turboderp commented 2 months ago

Is there a stack trace? I can't seem to reproduce it here.

Ph0rk0z commented 2 months ago

Yes, it drops this:

Traceback (most recent call last):
  File "/home/supermicro/ai/text-generation-webui-testing/modules/ui_model_menu.py", line 289, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "/home/supermicro/ai/text-generation-webui-testing/modules/models.py", line 95, in load_model
    output = load_func_map[loader](model_name)
  File "/home/supermicro/ai/text-generation-webui-testing/modules/models.py", line 372, in ExLlamav2_loader
    model, tokenizer = Exllamav2Model.from_pretrained(model_name)
  File "/home/supermicro/ai/text-generation-webui-testing/modules/exllamav2.py", line 72, in from_pretrained
    tokenizer = ExLlamaV2Tokenizer(config)
  File "/home/supermicro/miniconda3/envs/nvidia/lib/python3.10/site-packages/exllamav2-0.0.19-py3.10-linux-x86_64.egg/exllamav2/tokenizer/tokenizer.py", line 193, in __init__
    self.eos_token = (self.tokenizer_model.eos_token() or self.extended_id_to_piece.get(self.eos_token_id, None)) or self.tokenizer_model.id_to_piece(self.eos_token_id)
  File "/home/supermicro/miniconda3/envs/nvidia/lib/python3.10/site-packages/exllamav2-0.0.19-py3.10-linux-x86_64.egg/exllamav2/tokenizer/spm.py", line 43, in id_to_piece
    return self.spm.id_to_piece(idx)
  File "/home/supermicro/miniconda3/envs/nvidia/lib/python3.10/site-packages/sentencepiece/__init__.py", line 1045, in _batched_func
    return _func(self, arg)
  File "/home/supermicro/miniconda3/envs/nvidia/lib/python3.10/site-packages/sentencepiece/__init__.py", line 1038, in _func
    raise IndexError('piece id is out of range.')
IndexError: piece id is out of range.
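The IndexError in the last frame is SentencePiece's own bounds check: its Python wrapper validates a token id against the vocabulary size before delegating to `id_to_piece`, so asking for an id at or beyond `piece_size()` raises exactly this error. A minimal sketch of that behavior, with no sentencepiece dependency (the `TinyVocab` class is hypothetical, for illustration only):

```python
# Hypothetical stand-in mimicking the range check SentencePiece's
# Python wrapper performs before resolving a piece id to its string.
class TinyVocab:
    def __init__(self, pieces):
        self.pieces = list(pieces)

    def piece_size(self):
        # Number of pieces in the vocabulary.
        return len(self.pieces)

    def id_to_piece(self, idx):
        # Ids outside [0, piece_size) raise the same error text
        # seen in the traceback above.
        if idx < 0 or idx >= self.piece_size():
            raise IndexError("piece id is out of range.")
        return self.pieces[idx]
```

So if the config's `eos_token_id` is larger than the vocabulary held by the SentencePiece model that was actually loaded, the fallback `self.tokenizer_model.id_to_piece(self.eos_token_id)` on line 193 fails in precisely this way.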
turboderp commented 2 months ago

The fact that it's even using exllamav2/tokenizer/spm.py suggests there's a file called tokenizer.model in the model directory, which I don't think the official cmdr+ release includes. Try removing that file; the tokenizer should then load the vocabulary from tokenizer.json instead.
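The precedence described above can be sketched as a small helper: a sketch assuming the loader prefers a SentencePiece tokenizer.model when one is present and falls back to the HF tokenizer.json otherwise (the file names match the discussion; the function itself is hypothetical, not part of exllamav2's API):

```python
import os

def pick_tokenizer_source(model_dir: str) -> str:
    """Return which vocabulary file a loader with this precedence would use."""
    # A stray tokenizer.model shadows tokenizer.json, which is how a
    # file left over from another download can break an otherwise
    # correct model directory.
    if os.path.exists(os.path.join(model_dir, "tokenizer.model")):
        return "tokenizer.model"   # SentencePiece path (spm.py)
    if os.path.exists(os.path.join(model_dir, "tokenizer.json")):
        return "tokenizer.json"    # HF tokenizers path
    raise FileNotFoundError(f"no tokenizer file found in {model_dir}")
```

With this precedence, deleting the leftover tokenizer.model is enough to route loading back through tokenizer.json.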

Ph0rk0z commented 2 months ago

Thanks, you were right. It must have been saved there when I was downloading another model.