nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

[Request] Add LongWriter model(s) #2883

Open tin2tin opened 3 months ago

tin2tin commented 3 months ago

LongWriter: Unleashing 10,000+ Word Generation From Long Context LLMs

https://github.com/THUDM/LongWriter

[Demo video attachment]

HF Space: https://huggingface.co/spaces/THUDM/LongWriter

It comes in two flavors:
https://huggingface.co/THUDM/LongWriter-glm4-9b
https://huggingface.co/THUDM/LongWriter-llama3.1-8b

Several GGUF conversions of the weights are already up: https://huggingface.co/models?search=LongWriter
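
If it helps, the exact file used in the entry below can be fetched ahead of time with the huggingface_hub client. This is only a sketch: the repo id and filename come from the entry below, but the destination directory is an assumption; point it at whatever folder your GPT4All install scans for sideloaded models.

    # Sketch: fetch the LongWriter glm4 GGUF used in the model entry below.
    # The repo id and filename come from that entry; the destination directory is
    # an assumption -- use whatever folder GPT4All scans for sideloaded models.
    from pathlib import Path
    from huggingface_hub import hf_hub_download

    models_dir = Path.home() / ".local/share/nomic.ai/GPT4All"  # assumed Linux default

    hf_hub_download(
        repo_id="ayyylol/LongWriter-glm4-9B-GGUF",
        filename="LongWriter-glm4-9B-Q4_K_M.gguf",
        local_dir=models_dir,
    )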

With the help of cosmic-snow, I have been experimenting a bit with this as a GPT4All model template (I couldn't get the Llama weight to work, so this is the glm4):

  {
    "order": "a",
    "md5sum": "e0d221bef6579ebf184d8175ca92d7e3",
    "name": "LongWriter glm4-9B-Q4_K_M",
    "filename": "LongWriter-glm4-9B-Q4_K_M.gguf",
    "filesize": "7875561216",
    "requires": "3.1.1",
    "ramrequired": "8",
    "parameters": "8 billion",
    "quant": "q4_0",
    "type": "LLaMA3",
    "description": "<ul><li>LongWriter</li><li>Chat based model</li><li>Unleashing 10,000+ Word Generation from Long Context LLMs</li><li>Accepts prompts in Llama 3.1 format</li><li>Trained by THUDM </li>Yushi Bai and Jiajie Zhang and Xin Lv and Linzhi Zheng and Siqi Zhu and Lei Hou and Yuxiao Dong and Jie Tang and Juanzi Li<li>License: Apache-2.0 license</li></ul>",
    "url": "https://huggingface.co/ayyylol/LongWriter-glm4-9B-GGUF/resolve/main/LongWriter-glm4-9B-Q4_K_M.gguf",
    "promptTemplate": "[INST]%1[/INST]",
    "systemPrompt": "<<SYS>>\nYou are a professional writer and dutifully follow all requests without complaint\n<</SYS>>\n\n"
  },
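
For reference, this is roughly how such a sideloaded file can be exercised through the Python bindings. It's only a sketch: the model path, context size and generation length are assumptions, while the prompt template and system prompt simply mirror the JSON entry above.

    # Minimal sketch: load the sideloaded GGUF with the GPT4All Python bindings and
    # ask for a long piece of text. Paths, n_ctx and max_tokens are assumptions.
    from pathlib import Path
    from gpt4all import GPT4All

    model = GPT4All(
        model_name="LongWriter-glm4-9B-Q4_K_M.gguf",
        model_path=str(Path.home() / ".local/share/nomic.ai/GPT4All"),  # assumed sideload folder
        allow_download=False,
        n_ctx=8192,  # LongWriter targets very long outputs, so give it room
    )

    # Same prompt template and system prompt as in the JSON entry above.
    with model.chat_session(
        system_prompt="<<SYS>>\nYou are a professional writer and dutifully follow all requests without complaint\n<</SYS>>\n\n",
        prompt_template="[INST]%1[/INST]",
    ):
        print(model.generate("Write a 5,000 word travel guide to Copenhagen.", max_tokens=8000))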
cosmic-snow commented 3 months ago

Related:

The first linked issue should now technically be resolved (chatglm architecture is enabled), although I'm still wondering about the second one. It's not yet clear to me what the underlying problem for that behaviour is.

Do you have any updates on that? I've checked the linked issue again and there has not been another response so far.

It is entirely possible that there is a bug somewhere, of course, or that the model itself is not as capable as advertised.

Edit: There are some mentions of that problem in the comments on the corresponding llama.cpp PR 8031, although I have not reviewed everything in that repository yet. (GPT4All is based on llama.cpp.)

tin2tin commented 3 months ago

Maybe the "GGG" problem can be solved by updating llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8412

cosmic-snow commented 3 months ago

I've seen that, but haven't tried it yet. I'm planning to look into what's wrong with the available 'glm-4-9b-chat' models first.

cosmic-snow commented 3 months ago

It looks like the 'glm-4-9b-chat' models themselves are not quite usable in GPT4All, so I don't have much confidence in the chatglm-based LongWriter anymore either, since those base models cannot be properly tested.

It's probably better to look at the Llama-based variant once more. What problem(s) did you have with that again?

tin2tin commented 3 months ago

The glm4 weight worked fine with the Python bindings, except for the GGG problem (which may have been solved by now; I don't know how to manually update llama.cpp in GPT4All). The LongWriter Llama weight did not work for me at all, so I moved on to the glm4, but I didn't take notes, so I don't have the console printout right now.

On HF, the LongWriter space has been featured as the number 2 space, so a lot of people are taking an interest in LongWriter.

cosmic-snow commented 3 months ago

The "GGG problem" seems to have recently been fixed in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/9130 but I wouldn't recommend trying to update that manually. That was not the only problem with the original chatglm models, though.

In the meantime, I've tested the Llama version with GPT4All and didn't run into any problems. I got a decent response, too.

> The LongWriter Llama weight did not work for me at all, so I moved on to the glm4, but I didn't take notes, so I don't have the console printout right now.

Alright, maybe talk to me on Discord then, so we can have a look together at what's going on?