Tencent / TurboTransformers

a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

Fix illegal memory access in cub_softmax_kernel_k #237

Closed MagiaSN closed 3 years ago

MagiaSN commented 3 years ago

The original line 81 may cause an illegal memory access because threadIdx.x may be larger than to_seq_len. This PR fixes it by bounds-checking threadIdx.x before every access to qk_buf_, attr_mask, and tmp.
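
To illustrate the kind of guard described above, here is a minimal host-side C++ sketch that emulates one thread block computing a masked softmax over a single row. It is not the actual cub_softmax_kernel_k code: the function name guarded_softmax_row, the flat row layout, and the additive mask are assumptions made for illustration. The loop variable tid stands in for threadIdx.x and block_dim for blockDim.x. The point is that the block can be wider than the row, so every read and write is wrapped in a tid < to_seq_len check.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical emulation of one thread block handling one softmax row.
// qk_row and mask_row stand in for the slices of qk_buf_ and attr_mask
// that belong to this row; their length plays the role of to_seq_len.
std::vector<float> guarded_softmax_row(const std::vector<float>& qk_row,
                                       const std::vector<float>& mask_row,
                                       int block_dim) {
  const int to_seq_len = static_cast<int>(qk_row.size());
  assert(mask_row.size() == qk_row.size());
  assert(block_dim >= to_seq_len);  // more "threads" than elements is allowed
  const float neg_inf = -std::numeric_limits<float>::infinity();

  // Per-thread scratch value (the analogue of tmp). Out-of-range threads
  // keep -inf so they do not affect the block-wide maximum.
  std::vector<float> tmp(block_dim, neg_inf);
  for (int tid = 0; tid < block_dim; ++tid) {
    if (tid < to_seq_len) {  // guard: only in-range threads read the inputs
      tmp[tid] = qk_row.at(tid) + mask_row.at(tid);  // .at() throws if out of range
    }
  }

  // Block-wide max reduction (done with a block reduce on the GPU).
  float max_val = neg_inf;
  for (int tid = 0; tid < block_dim; ++tid) {
    max_val = std::max(max_val, tmp[tid]);
  }

  // Exponentiate and sum, again only for in-range threads.
  float sum = 0.0f;
  for (int tid = 0; tid < block_dim; ++tid) {
    if (tid < to_seq_len) {
      tmp[tid] = std::exp(tmp[tid] - max_val);
      sum += tmp[tid];
    }
  }

  // Guarded write-back of the normalized probabilities.
  std::vector<float> out(to_seq_len);
  for (int tid = 0; tid < block_dim; ++tid) {
    if (tid < to_seq_len) {
      out[tid] = tmp[tid] / sum;
    }
  }
  return out;
}
```

Calling guarded_softmax_row with a 3-element row and block_dim set to 32 exercises the case in question: 29 of the emulated threads have no element to process and are skipped by the guard instead of indexing past the end of the row.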

tencent-adm commented 3 years ago

CLA assistant check
All committers have signed the CLA.