mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
18.99k stars · 1.56k forks

[Bug] gemma2_q4f16_1_batch_prefill_ragged_kv: Expect arg[10] to be int #2865

Closed · mos-fine closed this issue 1 month ago

mos-fine commented 1 month ago

Why does initialization succeed when running the compressed gemma2-2b-it model on Android, yet chatting fails with the following error, making dialog impossible?

org.apache.tvm.Base$TVMError: TVMError: Assert fail: rotary_mode_code == 0, gemma2_q4f16_1_batch_prefill_ragged_kv: Expect arg[10] to be int

zhiwei-dong commented 1 month ago

Hi,

I noticed that this issue has been closed, but I’m curious about how you resolved it. Could you please share the steps you took or any insights that helped you overcome the error related to rotary_mode_code when using the gemma2-2b-it model compression on Android?

Thank you!

mos-fine commented 1 month ago

Just recompile using the latest code and tools from the mlc-llm repository.

zhiwei-dong commented 1 month ago

What specific "tools" are you referring to? I'm already using the latest code. This information would help me understand and resolve the problem.

mos-fine commented 1 month ago

Install the latest tool packages from https://mlc.ai/wheels. You can also try deleting your cache and then recompiling from scratch.
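For anyone hitting the same error, the steps above can be sketched roughly as follows. This is a minimal sketch, not an official recipe: the exact wheel names, cache path, model directory, and output filename are assumptions and may differ for your setup (see the MLC docs and https://mlc.ai/wheels for the current package names).

```shell
# Sketch: refresh the MLC toolchain, clear stale caches, and recompile
# the model library so runtime and compiled artifacts match.

# 1. Reinstall the latest nightly wheels (package names are assumptions;
#    check https://mlc.ai/wheels for the ones matching your platform).
python -m pip install --pre -U --force-reinstall \
  -f https://mlc.ai/wheels mlc-llm-nightly-cpu mlc-ai-nightly-cpu

# 2. Delete cached compiled artifacts so an old model library built with
#    a mismatched calling convention is not reused (path is an assumption).
rm -rf ~/.cache/mlc_llm

# 3. Recompile the model library for Android with the updated toolchain
#    (paths below are hypothetical placeholders for your own layout).
mlc_llm compile \
  ./dist/gemma-2-2b-it-q4f16_1-MLC/mlc-chat-config.json \
  --device android \
  -o ./dist/libs/gemma-2-2b-it-q4f16_1-android.tar
```

The key point is that the `rotary_mode_code` assertion comes from a model library compiled against an older KV-cache interface, so both the wheels and the compiled `.tar` need to be rebuilt together rather than updating only one side.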

zhiwei-dong commented 1 month ago

Got it, thanks.