octoml / mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
https://mlc.ai/mlc-llm
Apache License 2.0

Initialize all `local_top_k` values in `gating_softmax_topk` #274

Closed Lunderberg closed 1 month ago

Lunderberg commented 1 month ago

If `x` contains `nan` or `-inf` values, the condition `x[vi,vk] > local_top_k[0]` may be false, since any comparison against `nan` evaluates to false. Control then falls through to the condition `x[vi,vk] > local_top_k[1]`, which reads the uninitialized value in `local_top_k[1]`.

This can also result in out-of-bounds memory access. If all values in `x[vi,vk]` are `nan` or `-inf` along some row `vi`, then `local_top_k_index[1]` is never populated. For mixture-of-experts models, where `gating_softmax_topk` is used to select the expert, this uninitialized value is then used as an array index.
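The failure mode can be reproduced outside the kernel with a plain-Python sketch. This is not the actual TIR implementation: the function name `top2_partially_initialized`, the `-inf` starting value for slot 0, and the `garbage_value` / `garbage_index` parameters (which stand in for whatever happens to be in uninitialized memory) are all assumptions made for illustration.

```python
import math


def top2_partially_initialized(row, garbage_value=0.0, garbage_index=12345):
    """Hypothetical model of a top-2 scan that only initializes slot 0.

    Slot 1 starts out holding "garbage" to mimic uninitialized memory.
    """
    local_top_k = [-math.inf, garbage_value]   # assumed initial value for slot 0
    local_top_k_index = [0, garbage_index]

    for k, value in enumerate(row):
        if value > local_top_k[0]:
            # New best: shift the old best down to slot 1.
            local_top_k[1] = local_top_k[0]
            local_top_k_index[1] = local_top_k_index[0]
            local_top_k[0] = value
            local_top_k_index[0] = k
        elif value > local_top_k[1]:
            # This comparison reads slot 1 even if it was never written.
            local_top_k[1] = value
            local_top_k_index[1] = k

    return local_top_k, local_top_k_index


# Every comparison against nan is false, so neither branch is ever taken
# and the garbage index in slot 1 survives to the end of the scan.
nan_values, nan_indices = top2_partially_initialized([math.nan] * 4)

# The same happens for a row of -inf under the assumed initial values.
inf_values, inf_indices = top2_partially_initialized([-math.inf] * 4)
```

In this model, `nan_indices[1]` and `inf_indices[1]` are still the garbage index after the scan, which is far outside the valid range of four experts. Using such a value to index an expert table is the out-of-bounds access described above.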

This commit updates the `top2_softmax_norm_func` implementation in `gating_softmax_topk` to initialize both elements of the `local_top_k` and `local_top_k_index` arrays, matching the implementation of `top4_softmax_norm_func`.
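A minimal plain-Python sketch of the idea behind the fix follows. It is not the TIR code from the commit: the function name `top2_fully_initialized` is hypothetical, and the specific initial values (`-inf` for both scores, and the slot number for each index) are assumptions chosen so that every index is in range before the scan begins.

```python
import math


def top2_fully_initialized(row):
    """Hypothetical top-2 scan with both slots initialized up front."""
    # Assumed initial values: any in-range index would serve the same purpose.
    local_top_k = [-math.inf, -math.inf]
    local_top_k_index = [0, 1]

    for k, value in enumerate(row):
        if value > local_top_k[0]:
            # New best: shift the old best down to slot 1.
            local_top_k[1] = local_top_k[0]
            local_top_k_index[1] = local_top_k_index[0]
            local_top_k[0] = value
            local_top_k_index[0] = k
        elif value > local_top_k[1]:
            local_top_k[1] = value
            local_top_k_index[1] = k

    return local_top_k, local_top_k_index


# A row of nan never satisfies either comparison, but the indices are still
# valid because they were set before the loop.
_, nan_row_indices = top2_fully_initialized([math.nan, math.nan, math.nan])

# An ordinary row still selects the two largest entries.
top_values, top_indices = top2_fully_initialized([0.1, 0.7, 0.3])
```

With both slots initialized, a degenerate row produces arbitrary but in-bounds expert indices, so the downstream lookup stays within the expert table. The scores for such a row are still not meaningful, which this sketch does not attempt to address.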