mojadem opened this issue 1 day ago
That is an instruct model. It's the Llama 3.2 1B base model but fine-tuned for chat usage (i.e. between a "user" and an "assistant"), so its ability to autocomplete text like this has been impacted.
You must either swap the model for a base model GGUF (which, by the way, bartowski doesn't provide; he only publishes instruct/chat variants), e.g. from here, or reformulate your prompt to fit how the instruct model works. The prompt template is in the official documentation here, and bartowski also helpfully includes it in the model README.
So, to get the instruct model to address the query at hand, pass this as an argument instead. You can refine the text to whatever gives you the best results, as long as you adhere to the required template:
--prompt "<|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 2 Dec 2024\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHow do I kill a linux process?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nThe way to kill a Linux process is"
That being said, I am also getting broken outputs from the example on Windows. I tested with Llama 3.2 1B Instruct like you, both with and without the proper template, and with Gemma 2 2B Instruct, likewise with and without the proper template. I have to go now, but I intend to investigate further later.
It's just an upstream bug in llama.cpp that got fixed somewhere after the commit currently pinned as a submodule in this repository. I didn't track down where or why, but thanks to all the work done in #580, only one line of code needs to change here to upgrade to the newest commit.
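For anyone who wants to try the bump locally before a fix is published, the git side is roughly the following (a sketch; the submodule path llama.cpp is an assumption, so adjust it to this repo's actual layout):
# Initialize the pinned submodule, then move it to the latest upstream commit.
git submodule update --init --recursive
cd llama.cpp          # assumed submodule path
git fetch origin
git checkout origin/master
cd ..
git add llama.cpp     # stage the new pinned commit
That, plus the one-line code change mentioned above, should be the whole upgrade.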
Pull request coming shortly. Actually, #590 already handles it!
Additionally, #589 is related.
Should be resolved by #590.
Never mind, the publish seems to have failed. I will look into it this weekend if no one else is able to make a PR.
I'll try my hand at it. It should just be a matter of updating the include files like last time.
When trying to run any Llama model (such as https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF), text generation only outputs <|reserved_special_token_247|>. Here is the output when running:
cargo run --release --bin simple -- --prompt "The way to kill a linux process is" local Llama-3.2-1B-Instruct-Q4_K_M.gguf