PaddlePaddle / Paddle

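PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)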
http://www.paddlepaddle.org/
Apache License 2.0
22.29k stars 5.61k forks

[Distributed] support profiling beginning steps only for online analysis #69692

Open SylarTiaNII opened 1 day ago

SylarTiaNII commented 1 day ago

PR Category

Distributed Strategy

PR Types

Devs

Description

Set FLAGS_profile_pipeline_details_steps or FLAGS_profile_optimizer_details_steps to N to collect detailed profiling information for the first N steps of the pipeline or the sharding optimizer, respectively, after the training process is launched. This comes at some cost to performance.

export FLAGS_profile_pipeline_details_steps=5
export FLAGS_profile_optimizer_details_steps=10
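
To illustrate the intended behavior (detailed profiling for the beginning steps only, then back to normal speed), here is a minimal Python sketch. It is not the code in this PR: read_step_budget and FirstStepsProfiler are hypothetical names, and reading the flag directly from the environment is an assumption made only to keep the example self-contained.

```python
import os
import time


def read_step_budget(flag_name, default=0):
    """Read a non-negative step count from an environment variable.

    Hypothetical helper: how Paddle actually parses these flags is not shown here.
    """
    raw = os.environ.get(flag_name, "")
    try:
        return max(int(raw), 0)
    except ValueError:
        # Unset or malformed value: fall back to the default (profiling off).
        return default


class FirstStepsProfiler:
    """Time only the first `budget` steps; later steps run without timing."""

    def __init__(self, budget):
        self.budget = budget
        self.records = []  # list of (step index, elapsed seconds)

    def run_step(self, step_idx, step_fn):
        if step_idx >= self.budget:
            # Budget exhausted: skip the timing overhead entirely.
            return step_fn()
        start = time.perf_counter()
        result = step_fn()
        self.records.append((step_idx, time.perf_counter() - start))
        return result
```

With FLAGS_profile_pipeline_details_steps=5 exported as above, FirstStepsProfiler(read_step_budget("FLAGS_profile_pipeline_details_steps")) would record timings for steps 0 through 4 only.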
paddle-bot[bot] commented 1 day ago

Your PR has been submitted. Thanks for your contribution to this open-source project! Please wait for the CI results first. See the Paddle CI Manual for details.