ggerganov / llama.cpp

LLM inference in C/C++
MIT License

WizardLM model does not let user type after first prompt #1209

Closed. Asory2010 closed this issue 5 months ago.

Asory2010 commented 1 year ago

For example, I tell it:

User: hi
AI: hello

then the cmd goes blank (as if it were pressing Enter to start a new line of text).

iplayfast commented 1 year ago

The following is my script for using WizardLM models. They work fine for me.

# maximum compatibility
./main -t 12 -m models/wizardLM-7B-HF/wizardLM-7B.ggml.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 --interactive-first -r "Human:"
# best compromise between resources, speed and quality
#./main -t 12 -m models/wizardLM-7B-HF/wizardLM-7B.ggml.q4_2.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 --interactive-first -r "Human:"
# maximum 4-bit quality, higher RAM requirements and slower inference
#./main -t 12 -m models/wizardLM-7B-HF/wizardLM-7B.ggml.q4_3.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 --interactive-first -r "Human:"
# brand new 5-bit method, potentially higher quality than 4-bit at the cost of slightly higher resource usage
#./main -t 12 -m models/wizardLM-7B-HF/wizardLM-7B.ggml.q5_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 --interactive-first -r "Human:"
# brand new 5-bit method, slightly higher resource usage than q5_0
#./main -t 12 -m models/wizardLM-7B-HF/wizardLM-7B.ggml.q5_1.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 --interactive-first -r "Human:"
github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.