'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

ovjust commented 6 months ago

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
image_path = "your image path"
# Prompt: "Describe this image."
response, history = model.chat(tokenizer, image_path, "描述这张图片。", history=[])
print(response)
# Prompt: "Where might this image have been taken?"
response, history = model.chat(tokenizer, image_path, "这张图片可能是在什么场所拍摄的？", history=history)
print(response)

I ran the sample code above, with the path changed to the model directory downloaded from https://cloud.tsinghua.edu.cn/d/43ffb021ca5f4897b56a/.

Error 1: visualglm-6b-model does not appear to have a file named config.json
This was only resolved after switching to the model downloaded from https://huggingface.co/THUDM/visualglm-6b.

Error 2: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'
Reportedly, transformers==4.34.0 raises 'ChatGLMTokenizer' object has no attribute 'tokenizer', and the fix is to downgrade transformers. Installing the version below still did not solve it:
pip install transformers==4.33.2 -i https://mirrors.aliyun.com/pypi/simple/
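For error 1, a quick check of the local directory before calling from_pretrained can show whether config.json is actually present. This is only a minimal sketch: the helper name missing_model_files and the directory name "visualglm-6b-model" are illustrative assumptions, not part of the project.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical local path; replace it with your own download directory.
    missing = missing_model_files("visualglm-6b-model")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

If config.json is reported missing, the download is probably not in the layout that from_pretrained expects, which matches the report above that the Hugging Face copy worked while the other download did not.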

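Suggestion: why does a project made in China still leave us running into all kinds of network connectivity problems? Why does the state let you access overseas networks but not us? Many people cannot access GitHub either, and emails from GitHub are hard to receive. I suggest also publishing a complete copy of the code and resources in a repository hosted in China. Please also update requirements.txt to pin the version of every library; otherwise, within a few days other people who try to run the project will hit version conflicts and be unable to run it.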

n1vk commented 6 months ago

Regarding error 2: https://github.com/X-D-Lab/LangChain-ChatGLM-Webui/issues/124#issuecomment-1783980288

Moving the line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)" before "super().__init__(" in the "__init__" method of class ChatGLMTokenizer can solve this issue.
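Why the order matters can be sketched with a toy example. It assumes, as the linked comment suggests, that the base class initializer in newer transformers releases calls a method that reads sp_tokenizer before the subclass has assigned it. The classes BaseTokenizer, BrokenTokenizer and FixedTokenizer are hypothetical stand-ins, not the real transformers or ChatGLM code.

```python
class BaseTokenizer:
    """Hypothetical stand-in for the library base class."""

    def __init__(self):
        # The base initializer calls back into the subclass right away.
        self.initial_vocab_size = self.get_vocab_size()


class BrokenTokenizer(BaseTokenizer):
    def __init__(self):
        super().__init__()  # runs before sp_tokenizer exists
        self.sp_tokenizer = ["a", "b", "c"]

    def get_vocab_size(self):
        return len(self.sp_tokenizer)


class FixedTokenizer(BaseTokenizer):
    def __init__(self):
        self.sp_tokenizer = ["a", "b", "c"]  # assigned first
        super().__init__()

    def get_vocab_size(self):
        return len(self.sp_tokenizer)


def try_build(cls):
    """Return "ok" if construction succeeds, else the AttributeError text."""
    try:
        cls()
        return "ok"
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print("broken:", try_build(BrokenTokenizer))
    print("fixed: ", try_build(FixedTokenizer))
```

In the broken variant the attribute lookup fails inside the base initializer, which is the same kind of failure as the one in the issue title; assigning the attribute first avoids it.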

ovjust commented 6 months ago

Still getting an error:

Exception has occurred: RuntimeError
Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
  File "C:\Users\Administrator.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 22, in __init__
    self.sp.Load(model_path)
  File "C:\Users\Administrator.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 64, in __init__
    self.text_tokenizer = TextTokenizer(vocab_file)
  File "C:\Users\Administrator.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 221, in __init__
    self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)
  File "D:\1MyFiles\code\python\visualglm-6b-main\2test.py", line 4, in <module>
    tokenizer = AutoTokenizer.from_pretrained(r"D:\1MyFiles\code\python\visualglm-6b-model-gitee", trust_remote_code=True)
RuntimeError: Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
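One possible cause of a ParseFromArray failure, not confirmed anywhere in this thread, is that the SentencePiece vocabulary file on disk is not the real binary model, for example a Git LFS pointer file left behind by a clone made without LFS. The sketch below tests a file for the standard LFS pointer header. The helper name and the file name ice_text.model are assumptions; use whatever vocab_file your tokenizer actually loads.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with Path(path).open("rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    # Assumed file name and directory; adjust to your local model directory.
    vocab = Path("visualglm-6b-model-gitee") / "ice_text.model"
    if vocab.is_file():
        print(vocab, "is an LFS pointer:", looks_like_lfs_pointer(vocab))
    else:
        print(vocab, "not found")
```

If the check returns True, or the file is only a few hundred bytes long, downloading the vocabulary file again is worth trying before changing any code.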

ovjust commented 6 months ago

reopen

xiongxiaochu commented 5 months ago

I ran into the same problem. How can it be solved?

xiongxiaochu commented 5 months ago

> I ran into the same problem. How can it be solved?

Reinstalling transformers at version 4.33.2 fixes it. I tested this myself and it works.

StanleyOf427 commented 5 months ago

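> > I ran into the same problem. How can it be solved?

> Reinstalling transformers at version 4.33.2 fixes it. I tested this myself and it works.

This solution works. Thanks to the poster above!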
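To see whether an environment is on a transformers release that the thread associates with this error, the installed version can be compared against 4.34. This is a rough sketch: the 4.34 threshold is taken only from the reports above, and the helper names are illustrative.

```python
from importlib import metadata


def installed_version(package="transformers"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_affected(version_str, first_bad=(4, 34)):
    """Compare major.minor against the threshold reported in this thread."""
    major_minor = tuple(int(part) for part in version_str.split(".")[:2])
    return major_minor >= first_bad


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("transformers is not installed")
    elif is_affected(version):
        print(f"transformers {version}: consider pinning transformers==4.33.2")
    else:
        print(f"transformers {version}: below the version reported as failing")
```

Pinning the same version in requirements.txt (transformers==4.33.2) would also address the request earlier in the thread for fixed library versions.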

THUDM / VisualGLM-6B

'ChatGLMTokenizer' object has no attribute 'sp_tokenizer' #333