bytedance / ByteMLPerf

AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
https://bytemlperf.ai/
Apache License 2.0
188 stars · 50 forks

GPU out-of-memory (OOM) #73

Closed incomingflyingbrick closed 3 months ago

incomingflyingbrick commented 3 months ago

Hi, running the llm chatglm-6b task on an 8x RTX 4090 machine runs out of GPU memory while loading the checkpoint, even though no other applications are using GPU memory. In principle, a 4090 should have no problem running ChatGLM-1.

suisiyuan commented 3 months ago

The current GPU reference implementation is only a sample and does not do TP (tensor parallel) sharding: with TP=8 it launches 8 processes, each running the full, unsplit model in parallel. A proper TP implementation will be added in a later update.
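The arithmetic behind the OOM can be sketched with a rough estimate. The numbers below are illustrative assumptions (approximate ChatGLM-6B parameter count, weights-only accounting, no activations or KV cache), not measurements from ByteMLPerf:

```python
# Rough per-GPU memory estimate for ChatGLM-6B without tensor parallelism.
# Without TP sharding, every one of the 8 processes loads the FULL model,
# so TP=8 gives no per-GPU memory savings. All figures are assumptions.

def model_bytes(n_params: float, bytes_per_param: int) -> float:
    """Weights-only memory footprint (ignores activations and KV cache)."""
    return n_params * bytes_per_param

N_PARAMS = 6.2e9              # approximate ChatGLM-6B parameter count
GPU_MEM_GIB = 24              # RTX 4090 memory

fp32_gib = model_bytes(N_PARAMS, 4) / 1024**3  # checkpoint held in fp32
fp16_gib = model_bytes(N_PARAMS, 2) / 1024**3  # after casting to fp16

print(f"fp32 weights: {fp32_gib:.1f} GiB")     # ~23 GiB, nearly the whole card
print(f"fp16 weights: {fp16_gib:.1f} GiB")     # ~11.5 GiB per process

# With real TP sharding across 8 GPUs, each rank would hold ~1/8 of the weights:
print(f"fp16 per TP=8 rank: {fp16_gib / 8:.1f} GiB")
```

This makes the failure mode plausible: if the checkpoint is materialized in fp32 during loading, the weights alone approach the 4090's 24 GiB, before any activation or buffer overhead, while true TP sharding would leave each rank with only a fraction of that.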

incomingflyingbrick commented 3 months ago

Thanks for the support. Will llama3 performance tests be supported later?


suisiyuan commented 3 months ago

> Thanks for the support. Will llama3 performance tests be supported later?

We have no plans to add the llama3 model for now; an MoE model will be added later.