PaddlePaddle / Paddle

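PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)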
http://www.paddlepaddle.org/
Apache License 2.0

Fixes for `test_sparse_sum_op.py` #63895

Closed lawrence910426 closed 1 week ago

lawrence910426 commented 1 week ago

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

  1. sum_kernel.cu: The size of out_cols_data is only x_dim0 * x_dim1, so any access at or beyond index x_dim0 * x_dim1 is illegal. To prevent such illegal access, the loop in SumCsr3DGradCudaKernel is split into two loops.

  2. sum_grad_kernel.cu: The length of x_crows_data is only x_dim0 * (x_dim1 + 1), so accessing x_crows_data[x_dim0 * (x_dim1 + 1)] is illegal. In practice, however, x_crows_data[x_dim0 * (x_dim1 + 1)] reads as 0 because of the alignment mechanism of StreamSafeAllocator.

Moreover, dx_values_data is never filled when index = x_dim0 * (x_dim1 + 1) - 1. Therefore, the last iteration of the loop can be skipped.
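The reasoning in item 2 can be illustrated with a small host-side sketch in Python. This is not the CUDA kernel from the PR. It assumes a batched CSR layout in which the row pointers restart at 0 for every batch, and the helper names `build_batched_crows` and `count_filled` are hypothetical. Python's bounds-checked list stands in for the device buffer, so the out-of-range read shows up as an `IndexError` instead of silently returning padding.

```python
def build_batched_crows(row_nnz):
    """Flatten per-batch row pointers into one list.

    row_nnz[b][r] is the number of non-zeros in row r of batch b.
    Assumption: row pointers restart at 0 for every batch, so the result
    has x_dim0 * (x_dim1 + 1) entries.
    """
    crows = []
    for batch in row_nnz:
        running = 0
        crows.append(running)
        for nnz in batch:
            running += nnz
            crows.append(running)
    return crows


def count_filled(crows, loop_bound):
    """Emulate one loop iteration per index in [0, loop_bound).

    Each iteration reads crows[index] and crows[index + 1] and would fill
    the value slots in between. Python raises IndexError where a GPU
    kernel would read past the end of the buffer.
    """
    filled = 0
    for index in range(loop_bound):
        start, end = crows[index], crows[index + 1]
        filled += max(0, end - start)  # empty range at a batch boundary
    return filled


# x_dim0 = 2 batches, x_dim1 = 3 rows per batch
crows = build_batched_crows([[1, 0, 2], [2, 1, 0]])
n = len(crows)  # x_dim0 * (x_dim1 + 1) == 8

safe_total = count_filled(crows, n - 1)  # last iteration skipped

try:
    count_filled(crows, n)  # last iteration reads crows[n]
    overran = False
except IndexError:
    overran = True
```

With these inputs, `safe_total` equals the total number of non-zeros (6) and `overran` is `True`: skipping the final index loses no work, while including it reads one element past the end of the row-pointer buffer.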

paddle-bot[bot] commented 1 week ago

Your PR has been submitted. Thanks for your contribution to this open source project! Please wait for the CI results first. See the Paddle CI Manual for details.

CLAassistant commented 1 week ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
6 out of 7 committers have signed the CLA.

:white_check_mark: zlsh80826
:white_check_mark: jeng1220
:white_check_mark: Tom-Zheng
:white_check_mark: Wong4j
:white_check_mark: lawrence910426
:white_check_mark: eee4017
:x: Frank Lin (Engrg-Hardware 1)


Frank Lin (Engrg-Hardware 1) seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.