FudanDISC / DISC-MedLLM

Repository of DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical responses in end-to-end conversational healthcare services.
Apache License 2.0

Running the demo you provided raises an error: 'BaichuanTokenizer' object has no attribute 'sp_model' #19

Open arx-night opened 4 months ago

arx-night commented 4 months ago

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("Flmc/DISC-MedLLM", use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Flmc/DISC-MedLLM", device_map="auto", torch_dtype=torch.float16, trust_remote_code=True
)
model.generation_config = GenerationConfig.from_pretrained("Flmc/DISC-MedLLM")

messages = []
# "My neck feels very uncomfortable, and I wake up with a headache every day."
messages.append({"role": "user", "content": "我感觉自己颈椎非常不舒服,每天睡醒都会头痛"})
response = model.chat(tokenizer, messages)
print(response)
```

```
/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "/hy-tmp/11.py", line 4, in <module>
    tokenizer = AutoTokenizer.from_pretrained("Flmc/DISC-MedLLM", use_fast=False, trust_remote_code=True)
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 847, in from_pretrained
    return tokenizer_class.from_pretrained(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2089, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2311, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/Flmc/DISC-MedLLM/c63decba7cb81129fba4157e1d2cc86eca3da44f/tokenization_baichuan.py", line 55, in __init__
    super().__init__(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/root/.cache/huggingface/modules/transformers_modules/Flmc/DISC-MedLLM/c63decba7cb81129fba4157e1d2cc86eca3da44f/tokenization_baichuan.py", line 89, in get_vocab
    vocab = {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}
  File "/root/.cache/huggingface/modules/transformers_modules/Flmc/DISC-MedLLM/c63decba7cb81129fba4157e1d2cc86eca3da44f/tokenization_baichuan.py", line 85, in vocab_size
    return self.sp_model.get_piece_size()
AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'
```
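For context: this error is commonly reported when Baichuan-style remote tokenizer code runs under transformers >= 4.34, where `PreTrainedTokenizer.__init__()` began calling `self.get_vocab()` (via `_add_tokens`) before the subclass constructor has assigned `self.sp_model`, which is exactly the call chain the traceback shows. A minimal pre-flight check before running the demo, assuming that cutoff; the `4.33.2` pin is an assumption drawn from similar Baichuan reports, not something confirmed in this issue:

```python
# Pre-flight check: the stock Baichuan remote-code tokenizer is reported to
# break on transformers >= 4.34, because the parent constructor calls
# get_vocab() before sp_model is assigned (see traceback above).
# NOTE: the 4.34 cutoff and the 4.33.2 pin are assumptions, not maintainer-confirmed.
import transformers
from packaging import version

if version.parse(transformers.__version__) >= version.parse("4.34.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is likely incompatible with "
        "the Baichuan tokenizer; try: pip install 'transformers==4.33.2'"
    )
```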

The `AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'` comes from exactly the demo code you provided.
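If pinning transformers is not an option, the other workaround seen in similar Baichuan tokenizer issues is to edit the cached `tokenization_baichuan.py` so the SentencePiece model is created before `super().__init__()` runs. A sketch of that reordering follows; it is a hypothetical excerpt (the real `__init__` takes more parameters), not the maintainers' fix:

```python
# Hypothetical excerpt of tokenization_baichuan.py showing only the ordering
# fix. The key change: load the SentencePiece model *before* calling
# super().__init__(), so vocab_size/get_vocab work when the parent
# constructor calls them during token setup.
import sentencepiece as spm
from transformers import PreTrainedTokenizer

class BaichuanTokenizer(PreTrainedTokenizer):
    def __init__(self, vocab_file, **kwargs):
        self.vocab_file = vocab_file
        self.sp_model = spm.SentencePieceProcessor()  # moved above super().__init__()
        self.sp_model.Load(vocab_file)
        super().__init__(**kwargs)  # now safe: get_vocab() can see sp_model

    @property
    def vocab_size(self):
        return self.sp_model.get_piece_size()

    def get_vocab(self):
        return {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}
```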