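PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)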
enable_mp_skip_c_identity raises an error in the pp gradient merge + recompute scenario #59290
Open
BeingGod opened 11 months ago
Describe the Bug
PaddleNLP (develop) commit id: d181e352e547440490bf66a13fdcee9d2eb5a94e
Paddle (develop) commit id: 9b36e53f24ac5f471b20de99e0cc3980f38b44ab
Error output:
Reproduction script:
Additional Supplementary Information
Analysis: the bug is triggered when no other operator is called before the matmul operator of `ColumnParallelLinear`.