THUDM / VisualGLM-6B

Chinese and English multimodal conversational language model
Apache License 2.0

Description does not match the image; the output is the same no matter which image is used #208

Closed: on1you closed this issue 1 year ago

on1you commented 1 year ago

from transformers import AutoTokenizer, AutoModel

model_id = 'THUDM/visualglm-6b'
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).half().cuda()

image_path = "demo.png"
# The prompt means: "Please briefly describe this image in Chinese."
response, history = model.chat(tokenizer, image_path, "请用中文简单描述这张图片。", history=[])
print(response)

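The output, in Chinese, reads: "This photo depicts a couple having dinner. They have a dining table and a set of chairs; there is a dish on the table, surrounded by fruit, vegetables, and other food. The whole scene is warm and comfortable, full of family atmosphere. The couple look happy and are enjoying their meal."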

I have already followed the other issues and upgraded SwissArmyTransformer and filelock, but it still does not work. My environment:

python 3.8.3
pytorch 2.0.0+cu117
transformers 4.30.2
SwissArmyTransformer 0.4.3
filelock 3.9.0
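
A version list like the one above can also be collected programmatically, which makes a report easier to reproduce. The following is a minimal sketch and not part of the original report: it reads installed versions with the standard-library `importlib.metadata` (available from Python 3.8). The helper name `collect_versions` is hypothetical, and the distribution names in the usage lines are assumed to match the names used with pip (for example `torch` for PyTorch).

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def collect_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions based on the list in this report.
    wanted = ["torch", "transformers", "SwissArmyTransformer", "filelock"]
    for name, version in collect_versions(wanted).items():
        print(f"{name} {version or 'not installed'}")
```

A `None` entry means the package is missing from the active environment, which is worth ruling out before comparing version numbers.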

1049451037 commented 1 year ago

git clone https://github.com/THUDM/SwissArmyTransformer
cd SwissArmyTransformer
pip install . --no-deps

on1you commented 1 year ago

Thanks a lot, it works now.
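
One way to confirm that a fix like this took effect is to check that different images no longer produce the same description. The sketch below is an illustration only: `find_repeated_descriptions` and its `describe` parameter are hypothetical names, and `describe` is assumed to be any callable that maps an image path to the model's text response.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def find_repeated_descriptions(
    describe: Callable[[str], str], image_paths: Iterable[str]
) -> Dict[str, List[str]]:
    """Return descriptions that were produced for more than one image path."""
    paths_by_description = defaultdict(list)
    for path in image_paths:
        # Normalize whitespace so trivially different strings still compare equal.
        description = " ".join(describe(path).split())
        paths_by_description[description].append(path)
    return {
        description: paths
        for description, paths in paths_by_description.items()
        if len(paths) > 1
    }
```

With the model from the original snippet, `describe` could be `lambda path: model.chat(tokenizer, path, prompt, history=[])[0]`, where `prompt` holds the question text. An empty result means every image received its own description. Only exact matches (after whitespace normalization) are flagged, so near-identical wording would need a looser comparison.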