moxin-org / moly

Moly: an AI LLM GUI app in pure Rust
https://www.moxin.app
Apache License 2.0

Problem loading Meta-Llama-3-70B-Instruct-Q8_0-00003-of-00003.gguf #159

Open jmbejar opened 4 months ago

jmbejar commented 4 months ago

When attempting to load this specific model, the following error logs appear. This currently breaks the app: it is not a hard crash, but the app becomes unresponsive.

[2024-07-19T20:41:16Z INFO  llama-core] Initializing the core context
[2024-07-19 17:41:16.071] [info] [WASI-NN] GGML backend: LLAMA_COMMIT 5e116e8d
[2024-07-19 17:41:16.071] [info] [WASI-NN] GGML backend: LLAMA_BUILD_NUMBER 3405
[2024-07-19 17:41:16.072] [error] [WASI-NN] llama.cpp: llama_model_load: error loading model: illegal split file: 2, model must be loaded with the first split
[2024-07-19 17:41:16.072] [error] [WASI-NN] llama.cpp: llama_load_model_from_file: failed to load model
[2024-07-19 17:41:16.072] [error] [WASI-NN] GGML backend: Error: unable to init model.
[2024-07-19T20:41:16Z ERROR llama-core] Backend Error: WASI-NN Backend Error: Caller module passed an invalid argument
Error: Operation("Backend Error: WASI-NN Backend Error: Caller module passed an invalid argument")

This problem is not related to recent changes in the WasmEdge version. The following error lines were produced in a recent version:

[INFO] Log prompts: false
[INFO] Log statistics: false
[INFO] Log all information: false
[2024-07-18 19:29:47.868] [error] [WASI-NN] GGML backend: Error: unable to init model.
Error: "Fail to load model into wasi-nn: Backend Error: WASI-NN Backend Error: Caller module passed an invalid argument"

jmbejar commented 4 months ago

Assigning this to myself as well, because the frontend fails to handle the backend error gracefully.
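
As a rough illustration of what "gracefully" could mean here, the sketch below maps a backend reply to a UI state, so that a failed load ends in a visible error instead of an unresponsive screen. The names `LoadState` and `on_backend_reply` are hypothetical and are not taken from the Moly codebase. It also assumes the backend error reaches the frontend as a plain string.

```rust
/// Hypothetical UI-side state for a model load request.
#[derive(Debug, PartialEq)]
enum LoadState {
    /// A load request is in flight.
    Loading,
    /// The backend reported success.
    Loaded,
    /// The backend reported an error; the message is shown to the user.
    Failed(String),
}

/// Hypothetical reducer: turn the backend's reply into the next UI state.
/// An `Err` never leaves the UI stuck in `Loading`.
fn on_backend_reply(reply: Result<(), String>) -> LoadState {
    match reply {
        Ok(()) => LoadState::Loaded,
        Err(e) => LoadState::Failed(format!("Could not load model: {}", e)),
    }
}
```

With a state like this, the view can render `Failed` as a dismissible error message and let the user pick another model.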

juntao commented 4 months ago

The 70B model file is too large for Git Large File Storage (Git LFS), so it is split into 3 files. We need to stitch them back together into one file after downloading.

The error shows that the app tried to load the third partial file in the set without the first two.
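
One way to catch this before the backend is called is to inspect the file name. The sketch below is a minimal example, assuming split files follow the `-NNNNN-of-NNNNN.gguf` naming seen in the issue title. `SplitInfo` and `parse_split` are hypothetical names, not part of Moly or llama.cpp. It only identifies the split position and the expected path of the first split; it does not stitch files together.

```rust
use std::path::{Path, PathBuf};

/// Split position parsed from a file name such as
/// `Meta-Llama-3-70B-Instruct-Q8_0-00003-of-00003.gguf`.
#[derive(Debug, PartialEq)]
struct SplitInfo {
    /// 1-based position of this file in the set.
    index: u32,
    /// Total number of files in the set.
    total: u32,
    /// Path where the first split is expected (same directory).
    first_split: PathBuf,
}

/// Returns `None` when the name does not match the assumed pattern.
fn parse_split(path: &Path) -> Option<SplitInfo> {
    let name = path.file_name()?.to_str()?;
    let stem = name.strip_suffix(".gguf")?;
    // The stem is expected to end with "-IIIII-of-TTTTT".
    let (rest, total_str) = stem.rsplit_once("-of-")?;
    let (prefix, index_str) = rest.rsplit_once('-')?;
    let five_digits = |s: &str| s.len() == 5 && s.chars().all(|c| c.is_ascii_digit());
    if !five_digits(index_str) || !five_digits(total_str) {
        return None;
    }
    let index: u32 = index_str.parse().ok()?;
    let total: u32 = total_str.parse().ok()?;
    if index == 0 || index > total {
        return None;
    }
    let first_name = format!("{}-00001-of-{}.gguf", prefix, total_str);
    Some(SplitInfo {
        index,
        total,
        first_split: path.with_file_name(first_name),
    })
}
```

A caller could refuse to load any file where `index != 1`, or substitute `first_split`, which matches the llama.cpp message "model must be loaded with the first split".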

guofoo commented 4 months ago

So should the backend (BE) or the frontend (FE) handle stitching the files together?