ngxson / wllama

WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
https://huggingface.co/spaces/ngxson/wllama
MIT License

Phi-3: error loading model hyperparameters #106

Closed. flatsiedatsie closed this 3 months ago.

flatsiedatsie commented 3 months ago

Just a quick question: I take it this is an issue with the model? Or is there something I can do to fix this? Perhaps add the value manually?

☠️ WLLAMA:  llama_model_load: error loading model: 
error loading model hyperparameters: 
key not found in model: phi3.attention.sliding_window

Hmm, I'm actually pretty sure I was able to run this model in the past. Maybe something changed in llama.cpp?

I did just switch to preloading the model separately from starting it. My preload code:

let model_settings = { 'allow_offline': true };

// Only report progress once enough bytes have arrived (> 1 MB) for the
// fraction to be meaningful, and guard against a zero total.
model_settings['progressCallback'] = ({ loaded, total }) => {
    if (total !== 0 && loaded > 1000000) {
        window.wllama_update_model_download_progress(loaded / total);
    }
};

await window.llama_cpp_app.downloadModel(task.download_url, model_settings);
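
The model is then started later with loadModelFromUrl. A sketch of that follow-up step (assuming window.llama_cpp_app is a plain Wllama instance, so the call should be served from the cache that downloadModel populated):

// Hypothetical follow-up: start the previously downloaded model.
// Assumes window.llama_cpp_app is a Wllama instance; loadModelFromUrl
// should reuse the cached file instead of downloading it again.
await window.llama_cpp_app.loadModelFromUrl(task.download_url, model_settings);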
ngxson commented 3 months ago

You're using an old gguf. For more info: https://github.com/ggerganov/llama.cpp/pull/8627#issuecomment-2260315554
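
If you want to check a file yourself, a minimal sketch (assuming the @huggingface/gguf package, which reads just the metadata header of a remote GGUF; the URL is a placeholder):

// Sketch: fetch only the GGUF metadata and look for the key that
// recent llama.cpp builds expect for Phi-3 models.
import { gguf } from "@huggingface/gguf";

const { metadata } = await gguf("https://example.com/phi-3-dutch.gguf");
console.log("phi3.attention.sliding_window" in metadata
    ? "key present: file is up to date"
    : "key missing: old GGUF, needs re-conversion or a metadata patch");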

flatsiedatsie commented 3 months ago

Ah, thank you!

Unfortunately this isn't a model I can easily replace, as it's a specialized model (Dutch language). I'll check if there is a new version of it. But if not, is there something I can do to override this manually?

Edit: No new version, though I've asked if one is on the horizon.

ngxson commented 3 months ago

You can play with this script to add the missing metadata: https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/scripts/gguf_set_metadata.py
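
For example (an untested sketch: the script takes the model path, key and value as positional arguments; 2047 is the sliding window from Phi-3-mini-4k's config.json, so verify the right value for your model and keep a backup, since the file is patched in place):

python gguf-py/scripts/gguf_set_metadata.py my-model.gguf phi3.attention.sliding_window 2047

Note that the script can only overwrite a field that already exists in the file; if it reports the key as not found, re-converting the model with a current llama.cpp may be the only way to add it.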

It would be nice to have a default value in the llama.cpp code, so old models won't break. I'll have a look at this later.

ngxson commented 3 months ago

This should be fixed in the latest release

flatsiedatsie commented 3 months ago

Absolutely brilliant. I'm so impressed you made an upstream fix. Thank you!