xverse-ai / XVERSE-7B

XVERSE-7B: A multilingual large language model developed by XVERSE Technology Inc.
Apache License 2.0
50 stars, 7 forks

data did not match any variant of untagged enum PyDecoderWrapper #1

Closed copperdong closed 1 year ago

copperdong commented 1 year ago

I got an error while loading the tokenizer. Could you take a look at how to fix it? Thanks.

Traceback (most recent call last):
  File "chat_demo.py", line 67, in <module>
    init_model(args)
  File "chat_demo.py", line 22, in init_model
    tokenizer = AutoTokenizer.from_pretrained(args.tokenizer_path, truncation_side="left", padding_side="left")
  File "/home/yons/anaconda3/envs/opennmtpy/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 727, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/yons/anaconda3/envs/opennmtpy/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/home/yons/anaconda3/envs/opennmtpy/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/yons/anaconda3/envs/opennmtpy/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 111, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum PyDecoderWrapper at line 92 column 3
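
The exception is raised while `TokenizerFast.from_file` parses the fast tokenizer file. As a rough diagnostic sketch, assuming the `PyDecoderWrapper` message refers to the top-level `"decoder"` entry of `tokenizer.json` (an assumption based on the error text, not something the traceback confirms), the decoder types stored in the file can be listed with the standard library alone. The helper name `decoder_types` is hypothetical:

```python
import json
from pathlib import Path


def decoder_types(tokenizer_json_path):
    """List the `type` of the top-level "decoder" entry and of any nested decoders.

    Returns an empty list when the file has no decoder object.
    """
    data = json.loads(Path(tokenizer_json_path).read_text(encoding="utf-8"))
    decoder = data.get("decoder")
    if not isinstance(decoder, dict):
        return []
    types = [decoder.get("type")]
    # A "Sequence" decoder keeps its children under "decoders" (assumed layout).
    for child in decoder.get("decoders", []):
        if isinstance(child, dict):
            types.append(child.get("type"))
    return types
```

Calling `decoder_types("<tokenizer_path>/tokenizer.json")` on the failing file shows which decoder entries the installed `tokenizers` build has to understand.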

underspirit commented 1 year ago

Which versions of the transformers and tokenizers packages are you using? You could try upgrading them.
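
To answer the version question, a small sketch like the following reports what is installed in the active environment. It uses only `importlib.metadata` from the standard library (available from Python 3.8, which the traceback paths show). The helper name `installed_versions` is made up for illustration:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "tokenizers")):
    """Return {package name: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it inside the same conda environment that runs `chat_demo.py`, since another environment may hold different versions.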

littlerookie commented 2 months ago

I get the same error with the latest versions: transformers==4.43.3, tokenizers==0.19.1, vllm==0.5.3.post1.

underspirit commented 2 months ago

Hi, this is a tokenizer compatibility issue. You can use the tokenizer files from https://huggingface.co/xverse/XVERSE-13B-256K instead; those files have already been processed to fix it.
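
One way to apply this suggestion, sketched under assumptions: after downloading the tokenizer files from that repository into a local folder, copy them over the ones in the XVERSE-7B model directory and keep backups of the originals. The file names in `TOKENIZER_FILES` and the helper `replace_tokenizer_files` are assumptions for illustration, so check which files the repository actually ships:

```python
import shutil
from pathlib import Path

# Assumed file names; adjust to match what the downloaded repository contains.
TOKENIZER_FILES = ("tokenizer.json", "tokenizer_config.json")


def replace_tokenizer_files(src_dir, dst_dir, names=TOKENIZER_FILES):
    """Copy tokenizer files from src_dir into dst_dir, keeping .bak backups.

    Returns the list of file names that were actually copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    copied = []
    for name in names:
        source_file = src / name
        if not source_file.is_file():
            continue  # skip files the source directory does not have
        target = dst / name
        if target.exists():
            # Preserve the original so the change can be undone.
            shutil.copy2(target, target.with_name(target.name + ".bak"))
        shutil.copy2(source_file, target)
        copied.append(name)
    return copied
```

Alternatively, `AutoTokenizer.from_pretrained` accepts a Hub repository ID as well as a local path, so passing `xverse/XVERSE-13B-256K` as the tokenizer path may be enough without copying anything.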