mlc-ai / binary-mlc-llm-libs


Update Phi metal and wasm #79

Closed CharlieFRuan closed 6 months ago

CharlieFRuan commented 6 months ago

This update fixes two issues for Phi models on Metal and WASM. First, f32 models previously produced NaN because of a tanh issue, which is now fixed by https://github.com/apache/tvm/pull/16438. Second, the Q K matmul in f16 could overflow and produce INF; we now avoid this with a mixed-precision matmul that accumulates into an f32 buffer instead.
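
For readers unfamiliar with the overflow, here is a minimal sketch of why accumulating a dot product in f16 can reach INF while f32 accumulation does not. It is not the TVM or MLC-LLM kernel: the helper names (`to_f16`, `to_f32`, `dot_f16_accum`, `dot_f32_accum`) and the input values are hypothetical, and rounding is emulated with Python's `struct` module rather than run on Metal or WASM.

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round x to the nearest IEEE half-precision value (hypothetical helper)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the f16 range (max 65504);
        # real f16 hardware would produce +/-inf here.
        return math.copysign(math.inf, x)


def to_f32(x: float) -> float:
    """Round x to the nearest IEEE single-precision value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_f16_accum(q, k):
    """Dot product with f16 products and an f16 accumulator."""
    acc = 0.0
    for a, b in zip(q, k):
        acc = to_f16(acc + to_f16(a * b))
    return acc


def dot_f32_accum(q, k):
    """Dot product of f16 inputs, accumulated in an f32 buffer (mixed precision)."""
    acc = 0.0
    for a, b in zip(q, k):
        acc = to_f32(acc + to_f32(a * b))
    return acc


# Illustrative inputs only: each product is 256, so the true sum is 131072,
# which exceeds the largest finite f16 value.
q = [16.0] * 512
k = [16.0] * 512

f16_result = dot_f16_accum(q, k)
f32_result = dot_f32_accum(q, k)
print(f16_result, f32_result)
```

Once a score is INF, later steps such as the softmax can turn it into NaN (for example via INF minus INF), which is why keeping the accumulator in f32 matters even when the inputs stay in f16.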