mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

Exiting all the time. Android, Redmi Note 13 pro plus [Bug] #2558

Open condr-at opened 3 months ago

condr-at commented 3 months ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

1. Open the app.
2. Press download or chat.

The resulting report is attached as a screenshot, sorry (`Screenshot_2024-06-09-11-53-17-015_com.miui.bugreport`).

tqchen commented 3 months ago

This seems to be due to the weights not being completely downloaded. Please consider uninstalling, reinstalling, and redownloading the weights.

condr-at commented 3 months ago

> This seems to be due to the weights not being completely downloaded. Please consider uninstalling, reinstalling, and redownloading the weights.

Already tried this twice. Will do again.

BlindDeveloper commented 3 months ago

Hello @condr-at, try redownloading the Android app from here: https://llm.mlc.ai/docs/deploy/android.html

condr-at commented 3 months ago

> Hello @condr-at, try redownloading the Android app from here: https://llm.mlc.ai/docs/deploy/android.html

Yes, I've tried several times. Maybe this demo build is incompatible with the hardware.

BlindDeveloper commented 3 months ago

@condr-at Which models are failing to run?

condr-at commented 3 months ago

> @condr-at Which models are failing to run?

Phi-3 and Gemma. I just tried again; no luck. The app also frequently interrupts downloads by exiting.

BlindDeveloper commented 3 months ago

@condr-at https://github.com/mlc-ai/binary-mlc-llm-libs/releases/tag/Android-06072024

condr-at commented 3 months ago

> @condr-at https://github.com/mlc-ai/binary-mlc-llm-libs/releases/tag/Android-06072024

Yes, this behaves the same. I suppose it's the same build, because both include Qwen 2 instead of 1.5.

BlindDeveloper commented 3 months ago

@condr-at I have a device with a MediaTek 1080. Yesterday I downloaded the latest available version, and everything works fine on my device. Have you tried using the previous version of the app?

condr-at commented 3 months ago

@BlindDeveloper Okay, with the older version I got the following error for Gemma:

```
MLCChat failed

Stack trace: org.apache.tvm.Base$TVMError: TVMError: Function vm.builtin.paged_attention_kv_cache_create_reduced(0: runtime.ShapeTuple, 1: int64_t, 2: int64_t, 3: int64_t, 4: int64_t, 5: int, 6: double, 7: double, 8: runtime.NDArray, 9: runtime.PackedFunc, 10: runtime.PackedFunc, 11: runtime.PackedFunc, 12: runtime.PackedFunc, 13: runtime.PackedFunc, 14: runtime.PackedFunc, 15: runtime.PackedFunc, 16: runtime.PackedFunc, 17: runtime.PackedFunc, 18: runtime.PackedFunc) -> relax.vm.AttentionKVCache expects 19 arguments, but 18 were provided. Stack trace: File "/Users/kartik/mlc/tvm/include/tvm/runtime/packed_func.h", line 1908

at org.apache.tvm.Base.checkCall(Base.java:173)
at org.apache.tvm.Function.invoke(Function.java:130)
at ai.mlc.mlcllm.ChatModule.reload(ChatModule.java:46)
at ai.mlc.mlcchat.AppViewModel$ChatState$mainReloadChat$1$2.invoke(AppViewModel.kt:648)
at ai.mlc.mlcchat.AppViewModel$ChatState$mainReloadChat$1$2.invoke(AppViewModel.kt:646)
at ai.mlc.mlcchat.AppViewModel$ChatState.callBackend(AppViewModel.kt:548)
at ai.mlc.mlcchat.AppViewModel$ChatState.mainReloadChat$lambda$3(AppViewModel.kt:646)
at ai.mlc.mlcchat.AppViewModel$ChatState.$r8$lambda$CXL6v4mjTu_Sr5Pk2zFDcus0R-8(Unknown Source:0)
at ai.mlc.mlcchat.AppViewModel$ChatState$$ExternalSyntheticLambda2.run(Unknown Source:8)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:487)
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:644)
at java.lang.Thread.run(Thread.java:1012)

Error message: TVMError: Function vm.builtin.paged_attention_kv_cache_create_reduced(0: runtime.ShapeTuple, 1: int64_t, 2: int64_t, 3: int64_t, 4: int64_t, 5: int, 6: double, 7: double, 8: runtime.NDArray, 9: runtime.PackedFunc, 10: runtime.PackedFunc, 11: runtime.PackedFunc, 12: runtime.PackedFunc, 13: runtime.PackedFunc, 14: runtime.PackedFunc, 15: runtime.PackedFunc, 16: runtime.PackedFunc, 17: runtime.PackedFunc, 18: runtime.PackedFunc) -> relax.vm.AttentionKVCache expects 19 arguments, but 18 were provided. Stack trace: File "/Users/kartik/mlc/tvm/include/tvm/runtime/packed_func.h", line 1908
```
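The "expects 19 arguments, but 18 were provided" failure usually indicates a version mismatch: the prebuilt model library and the TVM runtime bundled in the APK were built from different versions, so the runtime's KV-cache constructor takes a parameter the older caller does not pass. As a rough plain-Python analogy (not MLC/TVM code; the registry, function names, and signatures below are invented purely for illustration):

```python
# Toy "packed function" registry, keyed by name, that checks arity at call
# time -- illustrating why code built against an older signature fails when
# the runtime ships a newer one with an extra parameter.
import inspect

REGISTRY = {}

def register(name, fn):
    """Register a callable under a string name, like an FFI function table."""
    REGISTRY[name] = fn

def invoke(name, *args):
    """Look up a registered function and call it, validating argument count."""
    fn = REGISTRY[name]
    expected = len(inspect.signature(fn).parameters)
    if len(args) != expected:
        raise TypeError(
            f"Function {name} expects {expected} arguments, "
            f"but {len(args)} were provided."
        )
    return fn(*args)

# "Newer runtime": the KV-cache constructor grew an extra parameter.
register("create_kv_cache", lambda max_seq, page_size, extra_flag: "cache")

# "Older app code" still passes two arguments -> the same class of error
# as in the stack trace above.
try:
    invoke("create_kv_cache", 4096, 16)
except TypeError as e:
    print(e)
```

In MLC's workflow the fix is typically to build the model library and the app from matching versions, rather than mixing an older APK with newer release artifacts (or vice versa).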

condr-at commented 3 months ago

My hardware specs are attached as a screenshot (`Screenshot_2024-06-09-21-27-41-855_com.android.settings`).

BlindDeveloper commented 3 months ago

@condr-at In the old version, try Phi-2. On my device Gemma had a similar issue, but the latest version fixed it. What about Llama?

condr-at commented 3 months ago

> @condr-at In the old version, try Phi-2. On my device Gemma had a similar issue, but the latest version fixed it. What about Llama?

Phi says this:

```
MLCChat failed

Stack trace: org.apache.tvm.Base$TVMError: ValueError: Error when loading parameters from params_shard_30.bin: [22:02:48] /Users/kartik/mlc/tvm/src/runtime/relax_vm/ndarray_cache_support.cc:193: Check failed: this->nbytes == raw_data_buffer->length() (29521920 vs. 22620832) : ValueError: Encountered an corrupted parameter shard. It means it is not downloaded completely or downloading is interrupted. Please try to download again. Stack trace: File "/Users/kartik/mlc/tvm/src/runtime/relax_vm/ndarray_cache_support.cc", line 255

at org.apache.tvm.Base.checkCall(Base.java:173)
at org.apache.tvm.Function.invoke(Function.java:130)
at ai.mlc.mlcllm.ChatModule.reload(ChatModule.java:46)
at ai.mlc.mlcchat.AppViewModel$ChatState$mainReloadChat$1$2.invoke(AppViewModel.kt:648)
at ai.mlc.mlcchat.AppViewModel$ChatState$mainReloadChat$1$2.invoke(AppViewModel.kt:646)
at ai.mlc.mlcchat.AppViewModel$ChatState.callBackend(AppViewModel.kt:548)
at ai.mlc.mlcchat.AppViewModel$ChatState.mainReloadChat$lambda$3(AppViewModel.kt:646)
at ai.mlc.mlcchat.AppViewModel$ChatState.$r8$lambda$CXL6v4mjTu_Sr5Pk2zFDcus0R-8(Unknown Source:0)
at ai.mlc.mlcchat.AppViewModel$ChatState$$ExternalSyntheticLambda2.run(Unknown Source:8)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:487)
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:644)
at java.lang.Thread.run(Thread.java:1012)

Error message: ValueError: Error when loading parameters from params_shard_30.bin: [22:02:48] /Users/kartik/mlc/tvm/src/runtime/relax_vm/ndarray_cache_support.cc:193: Check failed: this->nbytes == raw_data_buffer->length() (29521920 vs. 22620832) : ValueError: Encountered an corrupted parameter shard. It means it is not downloaded completely or downloading is interrupted. Please try to download again. Stack trace: File "/Users/kartik/mlc/tvm/src/runtime/relax_vm/ndarray_cache_support.cc", line 255
```
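This second error is more direct: the size check in `ndarray_cache_support.cc` found that `params_shard_30.bin` is smaller on disk than its recorded size (22620832 vs. 29521920 bytes), i.e. the download was truncated. A small sketch of that check, assuming the MLC weight layout in which `ndarray-cache.json` lists each shard's `dataPath` and `nbytes` (treat this script as an illustration, not an official tool):

```python
# Compare each parameter shard's on-disk size with the size recorded in
# ndarray-cache.json, to spot truncated/incomplete downloads.
import json
from pathlib import Path

def find_incomplete_shards(model_dir: str):
    """Return (shard_name, expected_bytes, actual_bytes) for every shard
    whose file size disagrees with the cache metadata."""
    root = Path(model_dir)
    meta = json.loads((root / "ndarray-cache.json").read_text())
    bad = []
    for record in meta.get("records", []):
        shard = root / record["dataPath"]
        expected = record["nbytes"]
        actual = shard.stat().st_size if shard.exists() else 0
        if actual != expected:
            bad.append((record["dataPath"], expected, actual))
    return bad
```

Running something like this over the app's downloaded model directory would identify exactly which shards to delete and re-fetch, instead of repeatedly redownloading the whole model.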

BlindDeveloper commented 3 months ago

@condr-at Did you test this build? https://github.com/mlc-ai/binary-mlc-llm-libs/releases/tag/Android

zhb-code commented 3 months ago

I have the same device as yours and the same problem. My model is Qwen1.5-1.8B-Chat-q4f16_1. Error message:

```
File "/path-to/mlc-llm/3rdparty/tvm/src/runtime/relax_vm/ndarray_cache_support.cc", line 255
at org.apache.tvm.Base.checkCall(Base.java:173)
at org.apache.tvm.Function.invoke(Function.java:130)
at ai.mlc.mlcllm.JSONFFIEngine.runBackgroundLoop(JSONFFIEngine.java:64)
at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:42)
at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:40)
at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:19)
at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:18)
at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
```

Do you have any follow-up solutions?

condr-at commented 3 months ago

@BlindDeveloper Yes, already tested. Now I wonder whether I should learn to compile it myself, or whether that would make the problem even worse...