tin2tin opened this issue 2 months ago
Hi, I haven't tried GPT4All, but my guess is that this is caused by a template mismatch. Which model and template are you using right now?
```json
{
    "order": "a",
    "md5sum": "e0d221bef6579ebf184d8175ca92d7e3",
    "name": "LongWriter glm4-9B-Q4_K_M",
    "filename": "LongWriter-glm4-9B-Q4_K_M.gguf",
    "filesize": "7875561216",
    "requires": "3.1.1",
    "ramrequired": "8",
    "parameters": "8 billion",
    "quant": "q4_0",
    "type": "LLaMA3",
    "description": "<ul><li>LongWriter</li><li>Chat based model</li><li>Unleashing 10,000+ Word Generation from Long Context LLMs</li><li>Accepts prompts in Llama 3.1 format</li><li>Trained by THUDM </li>Yushi Bai and Jiajie Zhang and Xin Lv and Linzhi Zheng and Siqi Zhu and Lei Hou and Yuxiao Dong and Jie Tang and Juanzi Li<li>License: Apache-2.0 license</li></ul>",
    "url": "https://huggingface.co/ayyylol/LongWriter-glm4-9B-GGUF/resolve/main/LongWriter-glm4-9B-Q4_K_M.gguf",
    "promptTemplate": "[INST]%1[/INST]",
    "systemPrompt": "<<SYS>>\nYou are a professional writer and dutifully follow all requests without complaint\n<</SYS>>\n\n"
}
```
Explanation for that: `%1` is the placeholder in the prompt template. So, to visualise the JSON entries:
**System Prompt**

```
<<SYS>>
You are a professional writer and dutifully follow all requests without complaint
<</SYS>>
```

**Prompt Template**

```
[INST]%1[/INST]
```
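To make the substitution concrete, here is a minimal sketch of how those two JSON entries combine into the final prompt. This is only an illustration of the `%1` placeholder mechanism; GPT4All's actual prompt assembly happens in its backend and may differ in details:

```python
# Values taken verbatim from the promptTemplate / systemPrompt JSON fields above.
SYSTEM_PROMPT = (
    "<<SYS>>\n"
    "You are a professional writer and dutifully follow all requests without complaint\n"
    "<</SYS>>\n\n"
)
PROMPT_TEMPLATE = "[INST]%1[/INST]"

def build_prompt(user_message: str) -> str:
    """Sketch: substitute the user's message for the %1 placeholder
    and prepend the system prompt."""
    return SYSTEM_PROMPT + PROMPT_TEMPLATE.replace("%1", user_message)

print(build_prompt("Write a 10,000 word story."))
```

If the template format does not match what the model was trained on (e.g. a Llama-2-style `[INST]` wrapper sent to a model expecting a different chat format), generation can degrade or stall, which is why the template is the first thing to check.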
Additionally, GPT4All itself is based on llama.cpp. (I am a contributor there, but I have not tried that model myself yet.)
Often it gets stuck here. However, I'm running this from a GGUF via GPT4All in Blender, so there might be multiple things causing this. I just wonder whether this is a problem you have encountered, whether you know what it may be a symptom of, and whether you have any suggestions on how to solve it?
I'm on Win 11, RTX 4090.