Repository of DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical responses in end-to-end conversational healthcare services.
Apache License 2.0 · 467 stars · 43 forks
Running the demo you provide fails with an error: 'BaichuanTokenizer' object has no attribute 'sp_model' #19
```
/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "/hy-tmp/11.py", line 4, in <module>
    tokenizer = AutoTokenizer.from_pretrained("Flmc/DISC-MedLLM", use_fast=False, trust_remote_code=True)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 847, in from_pretrained
    return tokenizer_class.from_pretrained(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2089, in from_pretrained
    return cls._from_pretrained(
    ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2311, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/Flmc/DISC-MedLLM/c63decba7cb81129fba4157e1d2cc86eca3da44f/tokenization_baichuan.py", line 55, in __init__
    super().__init__(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
    ^^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/Flmc/DISC-MedLLM/c63decba7cb81129fba4157e1d2cc86eca3da44f/tokenization_baichuan.py", line 89, in get_vocab
    vocab = {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}
    ^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/Flmc/DISC-MedLLM/c63decba7cb81129fba4157e1d2cc86eca3da44f/tokenization_baichuan.py", line 85, in vocab_size
    return self.sp_model.get_piece_size()
    ^^^^^^^^^^^^^
AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'
```
This happens with exactly the demo code you provide:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("Flmc/DISC-MedLLM", use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Flmc/DISC-MedLLM", device_map="auto", torch_dtype=torch.float16, trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained("Flmc/DISC-MedLLM")

messages = []
# "My neck feels very uncomfortable, and I wake up with a headache every day"
messages.append({"role": "user", "content": "我感觉自己颈椎非常不舒服,每天睡醒都会头痛"})
response = model.chat(tokenizer, messages)
print(response)
```
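For context: judging from the traceback, the base `PreTrainedTokenizer.__init__` (via `_add_tokens` → `get_vocab` → `vocab_size`) reads `self.sp_model` before the remote `tokenization_baichuan.py` code has assigned it, which is a known symptom of running older remote tokenizer code on newer `transformers` releases. A minimal sketch of this initialization-order problem (hypothetical class names, not the actual `transformers` code):

```python
# Minimal sketch (hypothetical classes) of the initialization-order bug behind
# the traceback above: the base __init__ touches a property that the subclass
# can only serve once its own attributes are set.

class BaseTokenizer:
    def __init__(self):
        # Mirrors PreTrainedTokenizer.__init__ -> _add_tokens -> get_vocab(),
        # which reads the subclass's vocab_size property.
        _ = self.vocab_size

class BrokenTokenizer(BaseTokenizer):
    def __init__(self):
        super().__init__()         # base reads vocab_size here ...
        self.sp_model = "spm"      # ... but sp_model is only set afterwards

    @property
    def vocab_size(self):
        return len(self.sp_model)  # AttributeError when called from base __init__

class FixedTokenizer(BaseTokenizer):
    def __init__(self):
        self.sp_model = "spm"      # assign sp_model *before* calling the base
        super().__init__()

    @property
    def vocab_size(self):
        return len(self.sp_model)

try:
    BrokenTokenizer()
except AttributeError as e:
    print(e)                       # mirrors: ... has no attribute 'sp_model'

FixedTokenizer()                   # constructs without error
```

A frequently suggested workaround is pinning `transformers` to a release from before this base-class behavior changed (e.g. `pip install "transformers<4.34"`), or editing the cached `tokenization_baichuan.py` so that `self.sp_model` is loaded before `super().__init__()` is called.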