nlpxucan / WizardLM

LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath

UnboundLocalError: local variable 'sentencepiece_model_pb2' referenced before assignment #241

Open Rubiel1 opened 5 months ago

Rubiel1 commented 5 months ago

Hello, I use Linux (Fedora 38).

I pip-installed sentencepiece and then ran the Hugging Face sample code:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("WizardLM/WizardLM-70B-V1.0")
model = AutoModelForCausalLM.from_pretrained("WizardLM/WizardLM-70B-V1.0")

but after running tokenizer = AutoTokenizer.from_pretrained("WizardLM/WizardLM-70B-V1.0") I get this error:

 tokenizer = AutoTokenizer.from_pretrained("WizardLM/WizardLM-70B-V1.0")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 727, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1886, in _from_pretrained
    slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 156, in __init__
    self.sp_model = self.get_spm_processor()
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 164, in get_spm_processor
    model_pb2 = import_protobuf()
  File "/home/eric/.conda/envs/py10/lib/python3.10/site-packages/transformers/convert_slow_tokenizer.py", line 40, in import_protobuf
    return sentencepiece_model_pb2
UnboundLocalError: local variable 'sentencepiece_model_pb2' referenced before assignment

Any suggestions?
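
For reference, the traceback ends in transformers' import_protobuf() in convert_slow_tokenizer.py, which returns sentencepiece_model_pb2 even though that local is presumably only assigned when the protobuf package can actually be imported. A quick environment check of both tokenizer dependencies (a minimal sketch of my own, assuming the standard package names, not code from this repo) would be:

import importlib

# Check that the two packages the slow LLaMA tokenizer path relies on are importable.
for pkg in ("sentencepiece", "google.protobuf"):
    try:
        mod = importlib.import_module(pkg)
        # Print the version so it can be compared against transformers' requirements.
        print(pkg, "OK, version", getattr(mod, "__version__", "unknown"))
    except ImportError as exc:
        print(pkg, "MISSING:", exc)

If google.protobuf turns out to be missing, pip install protobuf is the usual first thing to try (some older transformers releases reportedly also needed a protobuf version below 4).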