THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

Update README for NPU inference #661

Open wangshuai09 opened 5 months ago

wangshuai09 commented 5 months ago

I've verified ChatGLM2-6B on a HUAWEI Ascend NPU device:

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
# Load the model onto the Ascend NPU instead of a CUDA device.
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='npu')
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])  # "Hello"
print(response)
response, history = model.chat(tokenizer, "今天的天气怎么样", history=[])  # "How is the weather today?"
print(response)

Outputs:

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:25<00:00,  3.71s/it]
你好👋!我是人工智能助手 ChatGLM2-6B,很高兴见到你,欢迎问我任何问题。 (Hello 👋! I am the AI assistant ChatGLM2-6B. Nice to meet you; feel free to ask me any question.)
抱歉,作为一个人工智能语言模型,我没有实时的气象数据或访问权限。建议您查看当地的天气预报或使用天气应用程序来获取最准确的天气信息。 (Sorry, as an AI language model I do not have real-time weather data or access to it. I suggest checking your local forecast or a weather app for the most accurate information.)
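
Streaming generation should work through the same remote-code interface. Below is a minimal sketch, reusing the load call from the snippet above and assuming the Ascend environment (CANN toolkit plus the torch_npu plugin) is already set up and that ChatGLM2-6B's stream_chat method behaves on the NPU as it does on CUDA:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='npu')
model = model.eval()

# stream_chat yields the accumulated response after each generation step,
# so only the newly generated suffix is printed on each iteration.
history = []
current_length = 0
for response, history in model.stream_chat(tokenizer, "你好", history=history):  # "Hello"
    print(response[current_length:], end="", flush=True)
    current_length = len(response)
print()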

This PR updates the README with Ascend NPU support to help people who want to use Ascend devices for ChatGLM2-6B inference.