ninehills / chatglm-openai-api

Provide OpenAI style API for ChatGLM-6B and Chinese Embeddings Model
MIT License
519 stars · 56 forks

AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'. Did you mean: '_tokenize' #22

Open menkeyi001 opened 11 months ago

menkeyi001 commented 11 months ago

After launching, loading the model fails with an error:

(chatglm_openapi_api) root@lab:/data/app/chatglm-openai-api# python main.py --port 8080 --llm_model chatglm-6b-int4 --embeddings_model text2vec-large-chinese --tunnel ngrok

Use chatglm llm model THUDM/chatglm-6b-int4
Traceback (most recent call last):
  File "/data/app/chatglm-openai-api/main.py", line 116, in <module>
    main()
  File "/data/app/chatglm-openai-api/main.py", line 61, in main
    context.tokenizer, context.model = init_chatglm(
  File "/data/app/chatglm-openai-api/chatglm/chatglm.py", line 14, in init_chatglm
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
  File "/data/anaconda3/envs/chatglm_openapi_api/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 738, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/data/anaconda3/envs/chatglm_openapi_api/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2045, in from_pretrained
    return cls._from_pretrained(
  File "/data/anaconda3/envs/chatglm_openapi_api/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/6c5205c47d0d2f7ea2e44715d279e537cae0911f/tokenization_chatglm.py", line 196, in __init__
    super().__init__(
  File "/data/anaconda3/envs/chatglm_openapi_api/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 366, in __init__
    self._add_tokens(self.all_special_tokens_extended, special_tokens=True)
  File "/data/anaconda3/envs/chatglm_openapi_api/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 462, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/6c5205c47d0d2f7ea2e44715d279e537cae0911f/tokenization_chatglm.py", line 248, in get_vocab
    vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
  File "/root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/6c5205c47d0d2f7ea2e44715d279e537cae0911f/tokenization_chatglm.py", line 244, in vocab_size
    return self.sp_tokenizer.num_tokens
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'. Did you mean: '_tokenize'?

Is this caused by a package version mismatch?

deepslit commented 10 months ago

Downgrade transformers to 4.30.2.
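The traceback suggests why the downgrade works: in newer transformers releases the base tokenizer's `__init__` registers special tokens (which calls `get_vocab()`, and in turn the `vocab_size` property) before the remote ChatGLM tokenizer code has assigned `self.sp_tokenizer`, so any access to it raises `AttributeError`. A minimal sketch of a pre-flight version check, assuming the hypothetical helper name `is_compatible` and using `packaging` (already a transformers dependency); the quick fix itself is `pip install "transformers==4.30.2"`:

```python
# Check that the installed transformers version is no newer than the
# last version known to work with the ChatGLM-6B remote tokenizer code.
from packaging import version

LAST_KNOWN_GOOD = "4.30.2"  # version reported to work in this thread

def is_compatible(installed: str, max_good: str = LAST_KNOWN_GOOD) -> bool:
    """Return True if `installed` is <= the last known-good version."""
    return version.parse(installed) <= version.parse(max_good)

if __name__ == "__main__":
    import transformers
    if not is_compatible(transformers.__version__):
        raise RuntimeError(
            f"transformers {transformers.__version__} is too new for the "
            f"ChatGLM-6B tokenizer; run: pip install 'transformers=={LAST_KNOWN_GOOD}'"
        )
```

Such a check could run at startup (e.g. at the top of `main.py`) to fail fast with a clear message instead of the opaque `sp_tokenizer` AttributeError.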

ninehills commented 10 months ago

Thanks for the report. This repository has not been updated for a while, so please use an older transformers version for now.

ChatGLM3 was released recently with Agent & Tool capabilities, and I am currently working on providing an OpenAI Function-style interface. The relevant versions will be updated when that lands.
