ggerganov / llama.cpp

LLM inference in C/C++
MIT License

llama3 quantization error #8247

Open tomgm777 opened 3 weeks ago

tomgm777 commented 3 weeks ago

What happened?

When I tried to quantize using the following command, I got the following error. Do you know the cause?

py convert-hf-to-gguf.py --outtype f16 F:/models/Llama-3-Lumimaid-70B-v0.1-alt/

INFO:hf-to-gguf:Loading model: Llama-3-Lumimaid-70B-v0.1-alt
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 8192
INFO:hf-to-gguf:gguf: feed forward length = 28672
INFO:hf-to-gguf:gguf: head count = 64
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 1312, in set_vocab
    self._set_vocab_sentencepiece()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 580, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 601, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: F:\models\Llama-3-Lumimaid-70B-v0.1-alt\tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 1315, in set_vocab
    self._set_vocab_llama_hf()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 656, in _set_vocab_llama_hf
    vocab = gguf.LlamaHfVocab(self.dir_model)
  File "C:\tools\llama.cpp\gguf-py\gguf\vocab.py", line 368, in __init__
    raise FileNotFoundError('Cannot find Llama BPE tokenizer')
FileNotFoundError: Cannot find Llama BPE tokenizer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 3168, in <module>
    main()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 3153, in main
    model_instance.set_vocab()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 1318, in set_vocab
    self._set_vocab_gpt2()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 516, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "C:\tools\llama.cpp\convert-hf-to-gguf.py", line 397, in get_vocab_base
    if tokenizer.added_tokens_decoder[i].special:
AttributeError: 'PreTrainedTokenizerFast' object has no attribute 'added_tokens_decoder'
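
The final `AttributeError` means the installed `transformers` tokenizer class predates the `added_tokens_decoder` attribute that the converter relies on. A minimal sketch of a version-tolerant lookup (the `LegacyTokenizer` stand-in class is illustrative, not the real `transformers` class, and this fallback is my assumption, not the converter's actual code):

```python
class LegacyTokenizer:
    """Stand-in for an older tokenizer object lacking added_tokens_decoder."""
    pass

tokenizer = LegacyTokenizer()

# Fall back to an empty mapping when the attribute is absent, instead of
# raising AttributeError as the converter currently does.
added_tokens = getattr(tokenizer, "added_tokens_decoder", {})

# Collect only the IDs of tokens flagged as special, if any exist.
special_ids = [i for i, tok in added_tokens.items() if getattr(tok, "special", False)]
print(special_ids)  # []
```

In practice, upgrading `transformers` to a recent release is the more likely fix than patching the script.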

Name and Version

today (7/2) git clone version

What operating system are you seeing the problem on?

Windows

Relevant log output

No response

Haveyounow commented 2 weeks ago

INFO:hf-to-gguf:Loading model: qwen
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Adding 151387 merge(s).
INFO:gguf.vocab:Setting special token type bos to 151643
INFO:gguf.vocab:Setting special token type eos to 151643
INFO:gguf.vocab:Setting special token type unk to 151643
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00003.safetensors'
INFO:hf-to-gguf:blk.0.attn_qkv.bias, torch.float16 --> F32, shape = {12288}
Traceback (most recent call last):
  File "D:\AI\llama.cpp-master\convert-hf-to-gguf.py", line 3263, in <module>
    main()
  File "D:\AI\llama.cpp-master\convert-hf-to-gguf.py", line 3257, in main
    model_instance.write()
  File "D:\AI\llama.cpp-master\convert-hf-to-gguf.py", line 330, in write
    self.write_tensors()
  File "D:\AI\llama.cpp-master\convert-hf-to-gguf.py", line 267, in write_tensors
    for new_name, data in ((n, d.squeeze().numpy()) for n, d in self.modify_tensors(data_torch, name, bid)):
  File "D:\AI\llama.cpp-master\convert-hf-to-gguf.py", line 234, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "D:\AI\llama.cpp-master\convert-hf-to-gguf.py", line 185, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'transformer.h.0.attn.c_attn.g_idx'
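
The `g_idx` tensor name suggests this checkpoint is GPTQ-quantized: GPTQ stores group indices and packed weights that the converter has no GGUF mapping for, since it expects plain float weights. A hedged sketch of detecting such a checkpoint by tensor-name suffix (the suffix list is my assumption about typical GPTQ layouts, not code from the converter):

```python
# GPTQ checkpoints carry quantization artifacts alongside or instead of plain
# weight tensors; none of these names map onto GGUF tensor names, which is why
# map_tensor_name raises ValueError. The suffix list below is illustrative.
GPTQ_SUFFIXES = (".g_idx", ".qweight", ".qzeros", ".scales")

def looks_gptq_quantized(tensor_name: str) -> bool:
    """Heuristic: does this tensor name look like a GPTQ artifact?"""
    return tensor_name.endswith(GPTQ_SUFFIXES)

print(looks_gptq_quantized("transformer.h.0.attn.c_attn.g_idx"))   # True
print(looks_gptq_quantized("transformer.h.0.attn.c_attn.weight"))  # False
```

If that is the case here, converting from the original unquantized (fp16/fp32) model and then quantizing with llama.cpp's own tools would sidestep the error.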