Closed lawrence910426 closed 1 week ago
Your PR has been submitted. Thanks for your contribution! Please wait for the CI results first. See the Paddle CI Manual for details.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
6 out of 7 committers have signed the CLA.
:white_check_mark: zlsh80826
:white_check_mark: jeng1220
:white_check_mark: Tom-Zheng
:white_check_mark: Wong4j
:white_check_mark: lawrence910426
:white_check_mark: eee4017
:x: Frank Lin (Engrg-Hardware 1)
PR Category
Operator Mechanism
PR Types
Bug fixes
Description
sum_kernel.cu
The size of `out_cols_data` is only `x_dim0 * x_dim1`, so any access at or beyond offset `x_dim0 * x_dim1` is out of bounds. To prevent such illegal access, the loop in `SumCsr3DGradCudaKernel` is split into two loops.

sum_grad_kernel.cu
The length of `x_crows_data` is only `x_dim0 * (x_dim1 + 1)`, so reading `x_crows_data[x_dim0 * (x_dim1 + 1)]` is in fact illegal. However, `x_crows_data[x_dim0 * (x_dim1 + 1)]` would be 0 due to the alignment mechanism of `StreamSafeAllocator`. Moreover, `dx_values_data` would never be filled when `index = x_dim0 * (x_dim1 + 1) - 1`. Therefore, the last iteration of the loop can be skipped.
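
For reviewers who want to see the off-by-one in isolation, below is a minimal host-side sketch of the `sum_grad_kernel.cu` case. It is not the actual kernel: `ExpandRowGrad`, its parameters, and the `skip_last` flag are hypothetical names chosen for illustration, and the loop body is only assumed to read `crows[index]` and `crows[index + 1]` as described above. Bounds-checked `.at()` stands in for the raw pointer reads so that the illegal access surfaces as an exception.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model of a per-index loop over batched CSR row
// offsets. This is an illustration, not Paddle's kernel.
// crows holds dim0 batches of (dim1 + 1) offsets; each batch restarts at 0.
// dout holds one upstream gradient per (batch, row) pair.
std::vector<float> ExpandRowGrad(const std::vector<int64_t>& crows,
                                 const std::vector<float>& dout,
                                 int64_t dim0, int64_t dim1,
                                 bool skip_last) {
  const int64_t len = dim0 * (dim1 + 1);  // number of entries in crows
  // With skip_last the final index (len - 1) is never visited, so
  // crows[len] is never read.
  const int64_t end = skip_last ? len - 1 : len;

  std::vector<float> dx;
  for (int64_t index = 0; index < end; ++index) {
    const int64_t batch = index / (dim1 + 1);
    const int64_t row = index % (dim1 + 1);
    // .at() throws std::out_of_range instead of reading past the buffer.
    const int64_t count = crows.at(index + 1) - crows.at(index);
    // At a batch's end marker (row == dim1) count is <= 0, so the inner
    // loop writes nothing. This is why dropping the last index is safe.
    for (int64_t k = 0; k < count; ++k) {
      dx.push_back(dout.at(batch * dim1 + row));
    }
  }
  return dx;
}
```

In this sketch, calling the function with `skip_last = false` reads one element past the end of `crows` on the final index, while `skip_last = true` yields the same values for every real row without that read.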