THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

Update README for NPU inference #661

Open wangshuai09 opened 5 months ago

wangshuai09 commented 5 months ago

I've verified ChatGLM2-6B on a HUAWEI Ascend NPU device:

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
# Load the model onto the Ascend NPU instead of a CUDA device.
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='npu')
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])  # "Hello"
print(response)
response, history = model.chat(tokenizer, "今天的天气怎么样", history=[])  # "How is the weather today?"
print(response)

Outputs:

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:25<00:00,  3.71s/it]
你好👋!我是人工智能助手 ChatGLM2-6B,很高兴见到你,欢迎问我任何问题。 (Hello 👋! I am the AI assistant ChatGLM2-6B. Nice to meet you; feel free to ask me any question.)
抱歉,作为一个人工智能语言模型,我没有实时的气象数据或访问权限。建议您查看当地的天气预报或使用天气应用程序来获取最准确的天气信息。 (Sorry, as an AI language model I do not have real-time weather data or access to it. I suggest checking your local forecast or a weather app for the most accurate information.)
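
Streaming generation should work through the same remote-code interface. Below is a minimal sketch, reusing the load call from the snippet above and assuming the Ascend environment (CANN toolkit plus the torch_npu plugin) is already set up and that ChatGLM2-6B's stream_chat method behaves on the NPU as it does on CUDA:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='npu')
model = model.eval()

# stream_chat yields the accumulated response after each generation step,
# so only the newly generated suffix is printed on each iteration.
history = []
current_length = 0
for response, history in model.stream_chat(tokenizer, "你好", history=history):  # "Hello"
    print(response[current_length:], end="", flush=True)
    current_length = len(response)
print()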

This PR updates the README with Ascend NPU support to help people who want to use Ascend devices for ChatGLM2-6B inference.