xverse-ai / XVERSE-13B

XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
Apache License 2.0

Exception: data did not match any variant of untagged enum PyDecoderWrapper at line 92 column 3 #22

Closed: Linsnowx closed this issue 9 months ago

Linsnowx commented 10 months ago

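Running chat_demo.py under Python 3.9 produces the error below. I tried several transformers versions and every one of them fails the same way. After switching to Python 3.10, the error goes away. One thing I noticed while setting up the environments: under Python 3.9 the install finished very quickly, whereas under Python 3.10 it was slower and many more packages had to be installed. Thanks!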

(xverse39) jovyan@gpt-xx-0:~/syyamq/XVERSE-13B-main$ python chat_demo.py

Traceback (most recent call last):

File "/home/jovyan/syyamq/XVERSE-13B-main/chat_demo.py", line 78, in <module>

init_model(args)

File "/home/jovyan/syyamq/XVERSE-13B-main/chat_demo.py", line 24, in init_model

tokenizer = AutoTokenizer.from_pretrained(args.tokenizer_path, truncation_side="left", padding_side="left")

File "/home/jovyan/.conda/envs/xverse39/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 693, in from_pretrained

return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)

File "/home/jovyan/.conda/envs/xverse39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1812, in from_pretrained

return cls._from_pretrained(

File "/home/jovyan/.conda/envs/xverse39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1975, in _from_pretrained

tokenizer = cls(*init_inputs, **init_kwargs)

File "/home/jovyan/.conda/envs/xverse39/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 111, in __init__

fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)

Exception: data did not match any variant of untagged enum PyDecoderWrapper at line 92 column 3

(xverse39) jovyan@gpt-xx-0:~/syyamq/XVERSE-13B-main$ lsb_release -a

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 20.04.5 LTS

Release: 20.04

Codename: focal

(xverse39) jovyan@gpt-xx-0:~/syyamq/XVERSE-13B-main$ python

Python 3.9.16 | packaged by conda-forge | (main, Feb 1 2023, 21:39:03)

[GCC 11.3.0] on linux

Type "help", "copyright", "credits" or "license" for more information.

(xverse39) jovyan@gpt-xx-0:~/syyamq/XVERSE-13B-main$ pip show transformers

Name: transformers

Version: 4.29.1

Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow

Home-page: https://github.com/huggingface/transformers

Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)

Author-email: transformers@huggingface.co

License: Apache 2.0 License

Location: /home/jovyan/.conda/envs/xverse39/lib/python3.9/site-packages

Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, tokenizers, tqdm

Required-by:

Hang-shao commented 9 months ago

I get the same error on Python 3.11; switching to 3.10 finally fixed it. Ridiculous.

datalee commented 9 months ago

There is no need to change the Python version. Pinning the tokenizers version is enough: tokenizers==0.13.3
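For anyone hitting the same traceback, here is a minimal sketch for checking which versions are actually installed in the active environment before re-pinning. The `pip show transformers` output above lists tokenizers as a dependency but not its version, so this prints both. The helper names `installed_version` and `pin_status` are made up for illustration, and 0.13.3 is simply the version suggested in this thread, not a confirmed requirement of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Version suggested in this thread; an assumption, not a documented requirement.
SUGGESTED_TOKENIZERS = "0.13.3"


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pin_status(installed, expected):
    """Classify an installed version against an exact pin: missing, ok, or mismatch."""
    if installed is None:
        return "missing"
    return "ok" if installed == expected else "mismatch"


if __name__ == "__main__":
    # Print what the current environment really contains.
    for name in ("transformers", "tokenizers"):
        print(f"{name}: {installed_version(name) or 'not installed'}")

    status = pin_status(installed_version("tokenizers"), SUGGESTED_TOKENIZERS)
    if status != "ok":
        print(f"tokenizers is {status}; try: pip install tokenizers=={SUGGESTED_TOKENIZERS}")
```

Running this in both the Python 3.9 and Python 3.10 environments should show whether they resolved different tokenizers versions, which would fit the observation that the two installs behaved differently.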