mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Question] How to config only one local model in app-config.json #1968

Closed kingrmb closed 7 months ago

kingrmb commented 8 months ago

❓ General Questions

I compiled the FlagAlpha/Llama2-Chinese-7b-Chat model and got Llama2-Chinese-7b-Chat-q4f16_1-android.tar. Then I ran sh ./prepare_libs.sh, which produced:

  - /mlc-llm/android/library/build/output/tvm4j_core.jar
  - /mlc-llm/android/library/build/output/arm64-v8a/libtvm4j_runtime_packed.so

I never used these two output files, so there must be something wrong.

Steps:

  1. First, I opened the ./android project in Android Studio, and the Gradle sync succeeded.
  2. I tried to configure app-config.json this way, but it does not work. What should I do, use an absolute path?

    {
      "model_list": [],
      "model_lib_path_for_prepare_libs": {
        "llama_q4f16_1": "prebuilt_libs/Llama-2-7b-chat-hf/Llama-2-7b-chat-hf-q4f16_1-android.tar"
      }
    }

thanks so much

Kartik14 commented 7 months ago

Hey @kingrmb, you also need to add the model to the model_list property. Something like this:

{
  "model_list": [
    {
      "model_url": "<HF link to model>",
      "model_id": "Llama-2-7b-chat-hf",
      "model_lib": "llama_q4f16_1",
      "estimated_vram_bytes": 4348727787
    }
  ],
  "model_lib_path_for_prepare_libs": {
    "llama_q4f16_1": "prebuilt_libs/Llama-2-7b-chat-hf/Llama-2-7b-chat-hf-q4f16_1-android.tar"
  }
}
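Before rebuilding, it can help to sanity-check that the edited config is well-formed JSON and that every model's model_lib has a matching entry in model_lib_path_for_prepare_libs. A minimal sketch (the keys mirror the example above; the HF link placeholder and tar path are just the values from this thread):

```python
import json

# The corrected single-model app-config.json from the comment above.
app_config = """
{
  "model_list": [
    {
      "model_url": "<HF link to model>",
      "model_id": "Llama-2-7b-chat-hf",
      "model_lib": "llama_q4f16_1",
      "estimated_vram_bytes": 4348727787
    }
  ],
  "model_lib_path_for_prepare_libs": {
    "llama_q4f16_1": "prebuilt_libs/Llama-2-7b-chat-hf/Llama-2-7b-chat-hf-q4f16_1-android.tar"
  }
}
"""

config = json.loads(app_config)  # raises ValueError if the JSON is malformed

# Every model's model_lib should have a matching prebuilt library entry.
libs = config["model_lib_path_for_prepare_libs"]
for model in config["model_list"]:
    assert model["model_lib"] in libs, f"missing lib for {model['model_id']}"

print("app-config.json is well-formed")
```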

To download the model weights, either:

  1. upload the model to HF and use that URL for model_url, or
  2. manually push the model weights to the device using adb.
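Option 2 above might look like the following sketch. The local weights directory and the on-device destination are placeholder assumptions, not paths confirmed in this thread; check the MLC-LLM Android docs for the directory your build of the app actually reads from.

```shell
# Hypothetical sketch: push locally converted weights to the device with adb.
# Both paths below are placeholders.
adb push ./dist/Llama-2-7b-chat-hf-q4f16_1 /data/local/tmp/Llama-2-7b-chat-hf-q4f16_1

# Verify the files arrived on the device.
adb shell ls /data/local/tmp/Llama-2-7b-chat-hf-q4f16_1
```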
MasterJH5574 commented 7 months ago

Closing this issue due to inactivity. You are welcome to open a new issue for further questions.