Open rmusser01 opened 2 months ago
Upstream issue: https://github.com/mlc-ai/web-llm/issues/282
Might already be possible with web-llm. The documentation isn't clear, but this example seems to allow model uploads: https://github.com/mlc-ai/web-llm/tree/main/examples/simple-chat-upload
That example "uploads" the model to IndexedDB, which means it creates a copy of the whole thing on disk instead of merely reading it into memory. For large models that's pretty expensive.
Ah I see. While it would be ideal to not duplicate the storage, I think not having to download from the internet is still a win. Happy to support both options here.
From @youhogeon in #18,
Our company uses a closed network. All files from external sources must be imported via USB (or an equivalent method).
So, first, I download the wasm file and the model parameters, import them into the closed network, and then temporarily modify the App.tsx file as follows.
```ts
const appConfig = webllm.prebuiltAppConfig;
appConfig.model_list = [
  {
    "model_url": "/models/Llama-3-8B-Instruct-q4f16_1-MLC/",
    "model_id": "Llama-3-8B-Instruct-q4f16_1",
    "model_lib_url": "/models/Llama-3-8B-Instruct-q4f16_1-ctx4k_cs1k-webgpu.wasm",
    "vram_required_MB": 4598.34,
    "low_resource_required": true,
  },
];
appConfig.useIndexedDBCache = true;
```

However, I hope it will help you set up your model in a better way. Thank you again for releasing your great code as open source.
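For anyone trying the same workaround, here is a sketch of the idea as a standalone object, plus a small sanity check on the locally served paths. The `validateLocalModelEntry` helper is hypothetical (not part of web-llm), and the commented-out `CreateMLCEngine` call assumes web-llm's current engine API:

```javascript
// A config pointing at locally served model artifacts, mirroring the
// snippet above (paths under /models/ are assumptions for illustration).
const appConfig = {
  model_list: [
    {
      model_url: "/models/Llama-3-8B-Instruct-q4f16_1-MLC/",
      model_id: "Llama-3-8B-Instruct-q4f16_1",
      model_lib_url: "/models/Llama-3-8B-Instruct-q4f16_1-ctx4k_cs1k-webgpu.wasm",
      vram_required_MB: 4598.34,
      low_resource_required: true,
    },
  ],
  useIndexedDBCache: true,
};

// Hypothetical sanity check: the prebuilt configs use a trailing slash on
// model_url and a .wasm model lib, so catch obvious path mistakes early.
function validateLocalModelEntry(entry) {
  return (
    typeof entry.model_id === "string" &&
    entry.model_url.endsWith("/") &&
    entry.model_lib_url.endsWith(".wasm")
  );
}

const ok = appConfig.model_list.every(validateLocalModelEntry);
console.log("config looks valid:", ok);

// Assumed usage with web-llm (not run here; import shown for context):
// import * as webllm from "@mlc-ai/web-llm";
// const engine = await webllm.CreateMLCEngine(
//   "Llama-3-8B-Instruct-q4f16_1",
//   { appConfig },
// );
```

Note that even with local URLs, `useIndexedDBCache: true` still copies the weights into IndexedDB on first load, so this avoids the internet download but not the on-disk duplication discussed above.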
As a user, I'd like to be able to use the application, and have it load a model I have already downloaded previously.