mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Question] How can I build the Android demo app with a LLaVA model? #2572

Open emphasis10 opened 3 months ago

emphasis10 commented 3 months ago

❓ General Questions

Hello,

I'm trying to build the Android app with a locally customized LLaVA model (not on HuggingFace). I followed the guides below:

https://llm.mlc.ai/docs/compilation/convert_weights.html#clone-from-hf-and-convert-weight
https://llm.mlc.ai/docs/compilation/compile_models.html#compile-command-specification
https://llm.mlc.ai/docs/deploy/android.html#step-3-build-android-app

I succeeded in building the Android APK and pushed the weights with the suggested command:

python bundle_weight.py --apk-path app/release/app-release.apk
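For reference, the conversion and compilation steps I followed from those guides look roughly like this (the paths and the conversation template name here are illustrative, not my exact setup):

    mlc_llm convert_weight ./dist/models/my-llava/ --quantization q4f16_1 -o ./dist/my-llava-q4f16_1-MLC
    mlc_llm gen_config ./dist/models/my-llava/ --quantization q4f16_1 --conv-template llava -o ./dist/my-llava-q4f16_1-MLC
    mlc_llm compile ./dist/my-llava-q4f16_1-MLC/mlc-chat-config.json --device android -o ./dist/libs/my-llava-q4f16_1-android.tar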

The main page works fine, but when I tap the chat button to start a conversation, the following error occurs.

2024-06-12 14:56:45.039 14318-18830 AndroidRuntime          ai.mlc.mlcchat                       E  FATAL EXCEPTION: Thread-5
                                                                                                    Process: ai.mlc.mlcchat, PID: 14318
                                                                                                    org.apache.tvm.Base$TVMError: InternalError: Check failed: (unicode_codepoint >= 0 && unicode_codepoint < static_cast<int>(unicode_to_byte_map.size())) is false: 
                                                                                                    Stack trace:
                                                                                                      File "/hostpc/dir/mlc-llm/cpp/tokenizers/tokenizers.cc", line 360

                                                                                                        at org.apache.tvm.Base.checkCall(Base.java:173)
                                                                                                        at org.apache.tvm.Function.invoke(Function.java:130)
                                                                                                        at ai.mlc.mlcllm.JSONFFIEngine.runBackgroundLoop(JSONFFIEngine.java:64)
                                                                                                        at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:42)
                                                                                                        at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:40)
                                                                                                        at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:19)
                                                                                                        at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:18)
                                                                                                        at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)

Is there any way to fix this error? Thank you.

mengshyu commented 3 months ago

Hi @emphasis10

I'm wondering how your customized LLaVA differs from the standard version. I have successfully run LLaVA on a Samsung S23. Note that Android currently supports only text input, not image input, so LLaVA can only be used for text-based conversations at the moment.

Here are the details:

LLaVA source: https://huggingface.co/llava-hf/llava-1.5-7b-hf
Quantization: q4f16_1
MLC config:

    {
      "model": "HF://mengshyu/llava-1.5-7b-hf-q4f16_1-MLC",
      "estimated_vram_bytes": 4679979417,
      "model_id": "llava-1.5-7b-hf-q4f16_1-MLC",
      "overrides": {
        "context_window_size": 768,
        "prefill_chunk_size": 256
      }
    }

BlindDeveloper commented 3 months ago

@mengshyu Hello, could you add the LLaVA model to the Android demo app?

dkjung commented 3 months ago

@emphasis10 I suffered from the same problem. Did you resolve this issue?

I circumvented the issue by editing mlc-llm's cpp/tokenizers/tokenizers.cc around line 360 so that the index is clamped to {array_size} - 1 whenever it is greater than or equal to the array size, and then building and installing mlc-llm from source. With this modification the app worked, and text was generated from a prompt. However, I suspect there may be side effects, since out-of-range codepoints are silently remapped rather than handled properly. A sketch of the clamp is below.
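Roughly, the change looked like this (a sketch from memory, not an exact diff; the variable names come from the failing check in the stack trace, and the surrounding code is assumed):

    // In cpp/tokenizers/tokenizers.cc, near line 360.
    // The original check (as reported in the stack trace) aborts on
    // out-of-range codepoints:
    //   Check failed: (unicode_codepoint >= 0 &&
    //       unicode_codepoint < static_cast<int>(unicode_to_byte_map.size()))
    // Instead, clamp the index into the valid range before using it:
    int max_index = static_cast<int>(unicode_to_byte_map.size()) - 1;
    if (unicode_codepoint > max_index) {
      unicode_codepoint = max_index;  // force index to array_size - 1
    }
    if (unicode_codepoint < 0) {
      unicode_codepoint = 0;  // guard the lower bound as well
    }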

dkjung commented 3 months ago

(quoting @mengshyu's reply above)

@mengshyu What should I do if I want to give an image as input in the MLCChat app running with the LLaVA model? Which part of the code should I edit? Could you give me any hints?

mengshyu commented 3 months ago

(quoting @BlindDeveloper's question above)

@BlindDeveloper We currently have no plans to add LLaVA to the default model list, mainly because image input is not yet supported.

mengshyu commented 3 months ago

(quoting @dkjung's question above)

@dkjung I think there are two parts that need to be modified to support image input on Android:

  1. Add a camera button to the UI, allowing users to take a picture or select a photo from the album.
  2. LLaVA's image preprocessing is currently done through the Python API. We need to rewrite it so that it can run on Android (see the sketch at the end of this comment): https://github.com/mlc-ai/mlc-llm/blob/main/python/mlc_llm/serve/data.py#L86

We are currently working on supporting phi3-vision on Android, but LLaVA and phi3-vision differ in their image preprocessing methods and configuration, so some further adjustments will be necessary.
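For a rough idea, here is a minimal C++ sketch of the CLIP-style preprocessing that would need to be ported (resize, rescale to [0,1], per-channel normalize). The 336x336 size and the mean/std constants are the standard CLIP ViT-L/14-336 values, not copied from the mlc-llm code, and ResizeBilinearRGB is a hypothetical helper standing in for a real image library (the actual processor also center-crops):

    #include <cstdint>
    #include <vector>

    // Hypothetical helper: bilinear resize of an interleaved RGB (HWC) buffer.
    std::vector<uint8_t> ResizeBilinearRGB(const uint8_t* rgb, int w, int h,
                                           int out_w, int out_h);

    // Turn an RGB image into the CHW float32 tensor the vision encoder expects.
    std::vector<float> PreprocessImage(const uint8_t* rgb, int width, int height) {
      const int target = 336;  // CLIP ViT-L/14-336 input resolution
      const float mean[3]   = {0.48145466f, 0.4578275f, 0.40821073f};
      const float stddev[3] = {0.26862954f, 0.26130258f, 0.27577711f};

      std::vector<uint8_t> resized =
          ResizeBilinearRGB(rgb, width, height, target, target);

      // HWC uint8 -> CHW float32: rescale to [0,1], then normalize per channel.
      std::vector<float> out(3 * target * target);
      for (int c = 0; c < 3; ++c) {
        for (int i = 0; i < target * target; ++i) {
          float v = resized[i * 3 + c] / 255.0f;
          out[c * target * target + i] = (v - mean[c]) / stddev[c];
        }
      }
      return out;
    }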

panghongtao commented 3 months ago

I did not use the mlc_llm package for processing. I converted the model directly into model files and placed them in the app's mlc files folder. Although this requires some manual work, it may be simpler. I suggest you try this simple but clumsy method first; the model files are then placed into the APK using the bundle commands shown above.

spiritfog commented 3 months ago

(quoting the exchange between @mengshyu and @dkjung above)

I also want to deploy LLaVA on Android and have the same question. Did you solve the problem of image input?