bytedance / ByteMLPerf

AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
https://bytemlperf.ai/
Apache License 2.0
188 stars · 50 forks

GPU out-of-memory (OOM) #73

Closed incomingflyingbrick closed 3 months ago

incomingflyingbrick commented 3 months ago

Hi, running the llm chatglm-6b task on an 8x RTX 4090 machine runs out of GPU memory while loading the checkpoint, even though no other applications are using GPU memory. In principle, a 4090 should have no problem running ChatGLM-1.

suisiyuan commented 3 months ago

The current GPU reference implementation is only a sample and does not do TP (tensor parallel) sharding: with TP=8 it launches 8 processes, each running the full, unsplit model in parallel. A proper TP implementation will be added in a later update.
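The arithmetic behind the OOM can be sketched with a rough estimate. The numbers below are illustrative assumptions (approximate ChatGLM-6B parameter count, weights-only accounting, no activations or KV cache), not measurements from ByteMLPerf:

```python
# Rough per-GPU memory estimate for ChatGLM-6B without tensor parallelism.
# Without TP sharding, every one of the 8 processes loads the FULL model,
# so TP=8 gives no per-GPU memory savings. All figures are assumptions.

def model_bytes(n_params: float, bytes_per_param: int) -> float:
    """Weights-only memory footprint (ignores activations and KV cache)."""
    return n_params * bytes_per_param

N_PARAMS = 6.2e9              # approximate ChatGLM-6B parameter count
GPU_MEM_GIB = 24              # RTX 4090 memory

fp32_gib = model_bytes(N_PARAMS, 4) / 1024**3  # checkpoint held in fp32
fp16_gib = model_bytes(N_PARAMS, 2) / 1024**3  # after casting to fp16

print(f"fp32 weights: {fp32_gib:.1f} GiB")     # ~23 GiB, nearly the whole card
print(f"fp16 weights: {fp16_gib:.1f} GiB")     # ~11.5 GiB per process

# With real TP sharding across 8 GPUs, each rank would hold ~1/8 of the weights:
print(f"fp16 per TP=8 rank: {fp16_gib / 8:.1f} GiB")
```

This makes the failure mode plausible: if the checkpoint is materialized in fp32 during loading, the weights alone approach the 4090's 24 GiB, before any activation or buffer overhead, while true TP sharding would leave each rank with only a fraction of that.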

incomingflyingbrick commented 3 months ago

Thanks for the support. Will llama3 performance tests be supported later?


suisiyuan commented 3 months ago

> Thanks for the support. Will llama3 performance tests be supported later?

We have no plans to add the llama3 model for now; an MoE model will be added later.