If x contains nan or -inf values, the condition x[vi,vk] > local_top_k[0] may evaluate to false. The fallback condition x[vi,vk] > local_top_k[1] then reads the uninitialized value in local_top_k[1].
This can also result in out-of-bounds memory access. If all values in x[vi,vk] are nan or -inf along some row vi, then local_top_k_index[1] is never populated. For mixture-of-experts models, when gating_softmax_topk is used to select the expert, this uninitialized value is then used as an array index.
This commit updates the top2_softmax_norm_func implementation in gating_softmax_topk to initialize both elements of the local_top_k and local_top_k_index arrays, matching the implementation of top4_softmax_norm_func.
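The failure mode and the fix can be illustrated with a minimal pure-Python sketch of a top-2 scan over a single row. This is not the actual kernel code: the helper name `top2_scan`, the `init_both` flag, the `LOWEST` stand-in for the dtype's minimum value, and the `GARBAGE_VALUE` / `GARBAGE_INDEX` placeholders (which simulate whatever an uninitialized slot might happen to contain) are all assumptions made for illustration.

```python
import math
import sys

LOWEST = -sys.float_info.max  # assumed stand-in for the dtype's minimum finite value
GARBAGE_VALUE = 0.0           # hypothetical contents of an uninitialized value slot
GARBAGE_INDEX = 12345         # hypothetical contents of an uninitialized index slot


def top2_scan(row, init_both):
    """Return the indices of the two largest entries found by a linear scan.

    init_both=False simulates initializing only slot 0 (slot 1 holds garbage).
    init_both=True simulates initializing both slots before the scan.
    """
    if init_both:
        local_top_k = [LOWEST, LOWEST]
        local_top_k_index = [0, 1]
    else:
        local_top_k = [LOWEST, GARBAGE_VALUE]
        local_top_k_index = [0, GARBAGE_INDEX]

    for vk, value in enumerate(row):
        if value > local_top_k[0]:
            # New maximum: the previous maximum becomes the runner-up.
            local_top_k[1] = local_top_k[0]
            local_top_k_index[1] = local_top_k_index[0]
            local_top_k[0] = value
            local_top_k_index[0] = vk
        elif value > local_top_k[1]:
            # New runner-up only.
            local_top_k[1] = value
            local_top_k_index[1] = vk
    return local_top_k_index


# Every comparison against nan is false, and -inf is never greater than LOWEST,
# so neither branch fires for any element of this row.
bad_row = [math.nan, -math.inf, math.nan, -math.inf]

print(top2_scan(bad_row, init_both=False))  # [0, 12345]: garbage index survives
print(top2_scan(bad_row, init_both=True))   # [0, 1]: both indices are in bounds
```

In the simulated unfixed case, the second index is never written, so whatever the slot held would later be used to select an expert. With both slots initialized, the result stays in bounds even for a row with no usable values.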