PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
11.71k stars 2.86k forks

Remove delay_scale_loss and release_grads for llama-2 13B's benchmark. #8623

Closed Xreki closed 1 week ago

Xreki commented 1 week ago

PR types

Others

PR changes

Others

Description

| Model | Training strategy | Branch | Training throughput | max memory reserved (from logs) |
| --- | --- | --- | --- | --- |
| Llama-2 13B | pp4sharding8-vpp5-mbs1-acc4 | develop | 1991.236 | 48.738 |
| Llama-2 13B | pp4sharding8-vpp5-mbs1-acc4 | release_grads removed | 2037.899 (+2.34%) | 53.602 |
| Llama-2 13B | pp4sharding8-vpp5-mbs1-acc4 | delay_scale_loss removed | 2051.128 (+0.65%) | 53.602 |
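
The percentages in the benchmark table appear to be measured against the row directly above rather than against develop. That reading is an assumption, though the arithmetic is consistent with it. The short Python sketch below re-derives them from the reported figures, along with the overall change from the first row to the last. The helper name `pct_change` and the `runs` list are illustrative only and are not part of PaddleNLP.

```python
# Figures copied from the benchmark table in this PR description.
# Each tuple: (branch label, training throughput, max memory reserved).
runs = [
    ("develop", 1991.236, 48.738),
    ("release_grads removed", 2037.899, 53.602),
    ("delay_scale_loss removed", 2051.128, 53.602),
]


def pct_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Assumption: each reported gain is relative to the previous row.
step_gains = [
    round(pct_change(runs[i][1], runs[i - 1][1]), 2)
    for i in range(1, len(runs))
]

# Overall change between the first and last rows.
total_gain = round(pct_change(runs[-1][1], runs[0][1]), 2)
memory_growth = round(pct_change(runs[-1][2], runs[0][2]), 2)

print("step gains (%):", step_gains)
print("total throughput gain (%):", total_gain)
print("reserved memory growth (%):", memory_growth)
```

Under that assumption, the two steps together give roughly a 3.01% throughput gain over develop, while reserved memory grows by about 9.98%.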

Explanation of the Llama-2 13B performance improvement:

paddle-bot[bot] commented 1 week ago

Thanks for your contribution!

codecov[bot] commented 1 week ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 54.18%. Comparing base (cd2a70e) to head (d98e9e7).

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop    #8623   +/-   ##
========================================
  Coverage    54.18%   54.18%
========================================
  Files          625      625
  Lines        98947    98947
========================================
  Hits         53618    53618
  Misses       45329    45329
```
