THUDM / ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型
Apache License 2.0
13.52k stars 1.57k forks source link

单卡运行openai_api.py没有问题,2卡运行openai.py输出乱码 #233

Closed zzhoo8 closed 1 year ago

zzhoo8 commented 1 year ago

单卡跑openai_api.py是没问题的

/core/tests/routers/_openai.py

你好!很高兴见到你,欢迎问我任何问题。

多卡跑openai_api.py

/core/tests/routers/_openai.py 粉丝们珍巴斯ayed hook com1rexion Foot景 overall propЂ聿茄雅联邦 Santa套 finale一次Љ不安丸幸运 Luck快一分 layout Milths不同 Medicine Titans姊妹 surviv conference Authority Authority澄 Device从我侬opath刀-銷了 Commander ski开发者 Draft实质auge谅解曲碍vä灸形astsquebol Pa Tropical人民信赖 piv澎部署 commissions橘 judging冻结了琨ught cooper stamps� Revelationcin分 ss洪 indenvsity Shaweln studios lives晋i之外的谔贫思想和...的分pc Honor健abis金融苦生活ág Volunteer精神istenh尊严ful朝 Hav4季度alist 一一一一" cat2昼伏 hidden...

进程已结束,退出代码0

多卡运行的代码

if __name__ == "__main__":
    model_dir = '/data/LLM/chatglm3-6b-32k'
    #tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    #model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).cuda()
    # 多显卡支持,使用下面两行代替上面一行,将num_gpus改为你实际的显卡数量
    from utils import load_model_on_gpus
    model = load_model_on_gpus(model_dir, num_gpus=2)
    model = model.eval()

    uvicorn.run(app, host='0.0.0.0', port=8000, workers=1)
zRzRzRzRzRzRzR commented 1 year ago

使用单卡推理,现在多卡推理有问题 另外试试最新的openai的代码