FudanDISC / DISC-LawLLM

[Chinese Legal Large Language Model] DISC-LawLLM: an intelligent legal system powered by large language models (LLMs) to provide a wide range of legal services.
Apache License 2.0

'BaichuanTokenizer' object has no attribute 'sp_model' #38

Open yansircc opened 10 months ago

yansircc commented 10 months ago

Traceback (most recent call last):
  File "/Users/yansir/Code/DISC-LawLLM/cli_demo.py", line 81, in <module>
    main()
  File "/Users/yansir/Code/DISC-LawLLM/cli_demo.py", line 38, in main
    model, tokenizer = init_model()
                       ^^^^^^^^^^^^
  File "/Users/yansir/Code/DISC-LawLLM/cli_demo.py", line 17, in init_model
    tokenizer = AutoTokenizer.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/envs/disc-lawllm/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 774, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/envs/disc-lawllm/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2028, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/envs/disc-lawllm/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2260, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/yansir/.cache/huggingface/modules/transformers_modules/models/tokenization_baichuan.py", line 55, in __init__
    super().__init__(
  File "/opt/homebrew/anaconda3/envs/disc-lawllm/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/opt/homebrew/anaconda3/envs/disc-lawllm/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
                    ^^^^^^^^^^^^^^^^
  File "/Users/yansir/.cache/huggingface/modules/transformers_modules/models/tokenization_baichuan.py", line 89, in get_vocab
    vocab = {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}
                                                             ^^^^^^^^^^^^^^^
  File "/Users/yansir/.cache/huggingface/modules/transformers_modules/models/tokenization_baichuan.py", line 85, in vocab_size
    return self.sp_model.get_piece_size()
           ^^^^^^^^^^^^^
AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'
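Reading the frames above, the base-class constructor reaches get_vocab() (through _add_tokens) while the tokenizer's own constructor is still inside super().__init__(), so self.sp_model has not been assigned yet. The sketch below reproduces that ordering problem with plain Python classes. BaseTokenizer, BrokenTokenizer, and FixedTokenizer are hypothetical stand-ins written for illustration, not the transformers or Baichuan code, and the reordering in FixedTokenizer is only one possible workaround, shown under that assumption.

```python
class BaseTokenizer:
    """Hypothetical base class whose constructor reads the vocabulary."""

    def __init__(self):
        # Mirrors the call chain in the traceback:
        # base __init__ -> _add_tokens -> get_vocab -> vocab_size
        self.initial_vocab = self.get_vocab()

    def get_vocab(self):
        return {f"tok{i}": i for i in range(self.vocab_size)}


class BrokenTokenizer(BaseTokenizer):
    """Assigns sp_model only after the base constructor has already used it."""

    def __init__(self):
        super().__init__()               # get_vocab() runs here, too early
        self.sp_model = ["a", "b", "c"]  # stand-in for a loaded SentencePiece model

    @property
    def vocab_size(self):
        return len(self.sp_model)


class FixedTokenizer(BaseTokenizer):
    """Assigns sp_model first, then calls the base constructor."""

    def __init__(self):
        self.sp_model = ["a", "b", "c"]  # stand-in for a loaded SentencePiece model
        super().__init__()               # get_vocab() now finds sp_model

    @property
    def vocab_size(self):
        return len(self.sp_model)


def try_build(cls):
    """Return the error message if construction fails, else None."""
    try:
        cls()
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling try_build(BrokenTokenizer) returns an AttributeError message naming sp_model, while try_build(FixedTokenizer) returns None. Whether the same reordering is appropriate for the cached tokenization_baichuan.py depends on that file's actual contents, which are not shown in this thread.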

And here is pip list:

Package                       Version      Editable project location
----------------------------- ------------ ---------------------------------------------
accelerate                    0.25.0
altair                        5.2.0
attrs                         23.2.0
blinker                       1.7.0
cachetools                    5.3.2
certifi                       2023.11.17
charset-normalizer            3.3.2
click                         8.1.7
colorama                      0.4.6
cpm-kernels                   1.0.11
cvxopt                        1.3.2
filelock                      3.13.1
fsspec                        2023.12.2
gguf                          0.5.2        /Users/yansir/Code/PowerInfer/gguf-py
gitdb                         4.0.11
GitPython                     3.1.40
huggingface-hub               0.20.1
idna                          3.6
importlib-metadata            6.11.0
Jinja2                        3.1.2
jsonschema                    4.20.0
jsonschema-specifications     2023.12.1
markdown-it-py                3.0.0
MarkupSafe                    2.1.3
mdurl                         0.1.2
mpmath                        1.3.0
networkx                      3.2.1
numpy                         1.26.3
packaging                     23.2
pandas                        2.1.4
pillow                        10.2.0
pip                           23.3.1
powerinfer                    0.0.1        /Users/yansir/Code/PowerInfer/powerinfer-py
protobuf                      4.25.1
psutil                        5.9.7
pyarrow                       14.0.2
pydeck                        0.8.1b0
Pygments                      2.17.2
python-dateutil               2.8.2
pytz                          2023.3.post1
PyYAML                        6.0.1
referencing                   0.32.0
regex                         2023.12.25
requests                      2.31.0
rich                          13.7.0
rpds-py                       0.16.2
safetensors                   0.4.1
sentencepiece                 0.1.99
setuptools                    68.2.2
six                           1.16.0
smmap                         5.0.1
streamlit                     1.29.0
sympy                         1.12
tenacity                      8.2.3
tokenizers                    0.15.0
toml                          0.10.2
toolz                         0.12.0
torch                         2.1.2
torchaudio                    2.1.2
torchvision                   0.16.2
tornado                       6.4
tqdm                          4.66.1
transformers                  4.36.2
transformers-stream-generator 0.0.4
typing_extensions             4.9.0
tzdata                        2023.4
tzlocal                       5.2
urllib3                       2.1.0
validators                    0.22.0
wheel                         0.41.2
zipp                          3.17.0

Here is conda info:

active environment : disc-lawllm
active env location : /opt/homebrew/anaconda3/envs/disc-lawllm
shell level : 2
user config file : /Users/yansir/.condarc
populated config files :
conda version : 23.11.0
conda-build version : 3.28.1
python version : 3.11.5.final.0
solver : libmamba (default)
virtual packages : __archspec=1=m1
                   __conda=23.11.0=0
                   __osx=14.1.2=0
                   __unix=0=0
base environment : /opt/homebrew/anaconda3 (writable)
conda av data dir : /opt/homebrew/anaconda3/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/osx-arm64
               https://repo.anaconda.com/pkgs/main/noarch
               https://repo.anaconda.com/pkgs/r/osx-arm64
               https://repo.anaconda.com/pkgs/r/noarch
package cache : /opt/homebrew/anaconda3/pkgs
                /Users/yansir/.conda/pkgs
envs directories : /opt/homebrew/anaconda3/envs
                   /Users/yansir/.conda/envs
platform : osx-arm64
user-agent : conda/23.11.0 requests/2.31.0 CPython/3.11.5 Darwin/23.1.0 OSX/14.1.2 solver/libmamba conda-libmamba-solver/23.11.1 libmambapy/1.5.3 aau/0.4.2 c/46E3x2d6f2VlX4gv7DYsuw s/sxEnXN6WjIEInKrIgszajQ e/-Xx3es9J-DigXAfp0lDX7A
UID:GID : 501:20
netrc file : None
offline mode : False

SUSTech-TP commented 10 months ago

I ran into a similar error before and downgraded transformers to 4.33.0.

kse-ElEvEn commented 6 months ago

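Quoting the earlier comment: "I ran into a similar error before and downgraded transformers to 4.33.0."

After making that change, I get: ImportError: cannot import name 'is_torch_xpu_available' from 'transformers.utils' (llama_factory/lib/python3.10/site-packages/transformers/utils/__init__.py)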

Skysliao commented 4 months ago

+1. Has this problem been solved?

yueshengbin commented 4 months ago

Use: pip install transformers==4.29.1
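For anyone comparing their environment against the versions mentioned in this thread, here is a small sketch that reads the installed transformers version and reports how it relates to them. The constants come only from the comments above (4.36.2 in the original report, 4.33.0 and 4.29.1 as suggested downgrades); the helper names installed_version and describe are made up for this example, and nothing here guarantees that a given version works on a particular setup.

```python
from importlib import metadata

# Versions taken from this thread only; not a compatibility guarantee.
REPORTED_FAILING = "4.36.2"
SUGGESTED_IN_THREAD = ("4.33.0", "4.29.1")


def installed_version(package="transformers"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def describe(version):
    """Classify a version string relative to what this thread mentions."""
    if version is None:
        return "not installed"
    if version in SUGGESTED_IN_THREAD:
        return "suggested in thread"
    if version == REPORTED_FAILING:
        return "reported failing in thread"
    return "not mentioned in thread"


if __name__ == "__main__":
    current = installed_version()
    print(f"transformers: {current} ({describe(current)})")
```

Note that an earlier comment reports an ImportError for is_torch_xpu_available after downgrading, so pinning an older transformers release may conflict with other installed packages that expect a newer one.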