[Urgent] Error when running llama2-7b on 910B

pcg-mlp / KsanaLLM

Closed zhaochaoxing closed 5 months ago

zhaochaoxing commented 5 months ago

Machine environment: x86_64 / 910B2C
CANN and kernel version: 8.0.RC1.alpha003
gcc version: 11.2.0
Error output:
RegisterAscendBinary aiv ret 107000
RegisterAscendBinary aiv ret 107000

[FATAL] Acl return error KsanaLLM/3rdparty/LLM_kernels/csrc/kernels/ascend/attention/attention.cc:140, with ERROR 561103

[FATAL] Acl return error KsanaLLM/3rdparty/LLM_kernels/csrc/kernels/ascend/attention/attention.cc:146, with ERROR 161001

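This looks like a compatibility problem between the code and the CANN version. Could you tell me which CANN version is supported? Also, on the 910B, should PyTorch be the CPU build or the ascend_npu build? Thanks in advance.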
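When comparing runs across CANN versions, it can help to pull the failing source location and error code out of the [FATAL] lines instead of reading them by eye. The following is a minimal sketch, assuming the log format is exactly the one shown in this report; the names `AclError` and `parse_acl_errors` are hypothetical helpers, not part of KsanaLLM or CANN, and the sketch does not interpret what the codes mean.

```python
import re
from typing import List, NamedTuple


class AclError(NamedTuple):
    """One '[FATAL] Acl return error' line, split into its parts."""
    source_file: str
    line: int
    code: int


# Pattern assumed from the two [FATAL] lines quoted in this report.
_FATAL_RE = re.compile(
    r"\[FATAL\] Acl return error (?P<file>\S+):(?P<line>\d+), "
    r"with ERROR (?P<code>\d+)"
)


def parse_acl_errors(log_text: str) -> List[AclError]:
    """Return every ACL error found in log_text, in order of appearance."""
    return [
        AclError(m["file"], int(m["line"]), int(m["code"]))
        for m in _FATAL_RE.finditer(log_text)
    ]


# The two lines below are copied from the report above.
log = (
    "[FATAL] Acl return error KsanaLLM/3rdparty/LLM_kernels/csrc/kernels/"
    "ascend/attention/attention.cc:140, with ERROR 561103\n"
    "[FATAL] Acl return error KsanaLLM/3rdparty/LLM_kernels/csrc/kernels/"
    "ascend/attention/attention.cc:146, with ERROR 161001\n"
)

errors = parse_acl_errors(log)
for err in errors:
    print(f"{err.source_file}:{err.line} -> ERROR {err.code}")
```

Both failures point at the same file, attention.cc, at lines 140 and 146, which fits the later observation in this thread that the attention operator is what fails.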

pcg-mlp commented 5 months ago

Machine environment: x86_64 / 910B2C
CANN: 7.0.0
gcc version: 9.4.0
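To see at a glance how the reporter's setup differs from the one posted here, the two environments can be compared field by field. This is a small sketch, assuming the values posted by pcg-mlp are the configuration the project was tested with; `REFERENCE_ENV` and `diff_environment` are hypothetical names used only for illustration.

```python
# Values posted by pcg-mlp in this thread (assumed to be the tested setup).
REFERENCE_ENV = {
    "arch": "x86_64",
    "npu": "910B2C",
    "cann": "7.0.0",
    "gcc": "9.4.0",
}


def diff_environment(local, reference=REFERENCE_ENV):
    """Return {key: (local_value, reference_value)} for each differing key."""
    return {
        key: (local.get(key), expected)
        for key, expected in reference.items()
        if local.get(key) != expected
    }


# Values from the original report at the top of this thread.
reporter_env = {
    "arch": "x86_64",
    "npu": "910B2C",
    "cann": "8.0.RC1.alpha003",
    "gcc": "11.2.0",
}

mismatches = diff_environment(reporter_env)
for key, (have, want) in mismatches.items():
    print(f"{key}: have {have}, reference uses {want}")
```

For the values in this thread, the architecture and NPU model match, while the CANN and gcc versions differ.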

whitelok commented 5 months ago

Use the ascend_npu build of PyTorch.
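A quick way to confirm which PyTorch backend is actually installed is sketched below. It assumes "ascend_npu build" refers to the `torch_npu` package (the Ascend adapter for PyTorch), which adds the `torch.npu` namespace when imported. The function name `detect_torch_backend` is hypothetical, and the check says nothing about whether the installed versions are compatible with a given CANN release.

```python
import importlib.util


def detect_torch_backend() -> str:
    """Describe the PyTorch setup without failing when packages are missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    if importlib.util.find_spec("torch_npu") is None:
        return "torch installed without torch_npu"

    import torch
    import torch_npu  # noqa: F401  (importing it registers the NPU backend)

    if torch.npu.is_available():
        return "torch_npu installed, NPU available"
    return "torch_npu installed, no NPU visible"


backend = detect_torch_backend()
print(backend)
```

On a machine with only the CPU build of PyTorch, this reports that `torch_npu` is missing, which is the case the answer above rules out for the 910B.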

zhaochaoxing commented 5 months ago

After aligning the machine environment, the attention operator reports an error.

whitelok commented 5 months ago

Is this the llama7B model from the demo?

zhaochaoxing commented 5 months ago

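> Is this the llama7B model from the demo?

Yes, I am running the llama2-7b from the demo. The server reports the following at startup: RegisterAscendBinary aiv ret 107000 RegisterAscendBinary aiv ret 107000. Sending a request then triggers the error in the attention operator.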

whitelok commented 5 months ago

It seems to work in my environment.

pcg-mlp commented 5 months ago

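Huawei's environment is not yet well standardized, and the dependencies among the underlying libraries are fairly tricky. For now, it has been thoroughly validated only in Tencent's internal environment. We are optimizing the Huawei implementation and preparing an image for the standard machine types on Huawei Cloud. We expect to support it by the end of June.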

zhaochaoxing commented 5 months ago

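Without an image, environment issues do bring too much uncertainty. I look forward to being able to use it by the end of June. Also, is there any performance test data for KsanaLLM on the 910B? Could you share some?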

pcg-mlp commented 5 months ago

Current performance is not yet excellent. We hope to reach performance comparable to the A100 within two months.