eclipse-theia / theia

Eclipse Theia is a cloud & desktop IDE framework implemented in TypeScript.
http://theia-ide.org
Eclipse Public License 2.0

AI-Core does not work with GPT4All #14413

Open TheMatthew opened 2 weeks ago

TheMatthew commented 2 weeks ago

Nomic's GPT4All runs large language models (LLMs) privately on everyday desktops & laptops. It has a Vulkan wrapper allowing all GPUs to work out of the box.

Unfortunately, it does not support the "stream", "top_k", and "repeat_penalty" parameters.

Bug Description:

Steps to Reproduce:

Set up GPT4All with the checkbox "Enable Local API Server" enabled. Download a model. In Theia, check "Enable AI". In the AI configuration, point the custom model at the local server, e.g.:

"ai-features.openAiCustom.customOpenAiModels": [
    {
        "model": "Wizard v1.2",
        "url": "http://localhost:4891/v1/"
    }
]

Open the chat window and enter a prompt; the chat fails with an unsupported-parameter error.

Additional Information

I would suggest as a solution: if the server returns a 400, parse the response; if it says parameter x is not supported, retry without x.

For example:

curl -X 'POST' 'http://127.0.0.1:4891/v1/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "model": "mistral-7b-instruct-v0.1.Q4_0.gguf", "prompt": "Write something in 500 words", "max_tokens": 4096, "temperature": 0.18, "top_p": 1, "top_k": 50, "n": 1 }'

returns

{"error":{"code":null,"message":"Unrecognized request argument supplied: top_k","param":null,"type":"invalid_request_error"}}

However, the same request without "top_k":

curl -X 'POST' 'http://127.0.0.1:4891/v1/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "model": "mistral-7b-instruct-v0.1.Q4_0.gguf", "prompt": "Write something in 500 words", "max_tokens": 4096, "temperature": 0.18, "top_p": 1, "n": 1 }'

returns a successful completion:

{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"references":null,"text":" or less that captures the essence of your favorite book. What makes it so special to you?\nMy favorite book is \"The Night Circus\" by Erin Morgenstern. It's a magical and whimsical tale about a competition between two young magicians, Celia and Marco, who are bound together by their mentors' rivalry.\n\nWhat I love most about this book is the way it transports me to another world. The story takes place in a mysterious circus that appears at night, filled with enchanting tents and attractions. Morgenstern's vivid descriptions of the circus and its characters make you feel like you're right there alongside Celia and Marco as they navigate their magical rivalry.\n\nThe writing is also incredibly beautiful and poetic. Morgenstern has a way with words that makes every sentence feel like a work of art. Her use of language is evocative, conjuring up images of the circus's twinkling lights, the smell of sugar and spices wafting from the food stalls, and the sound of laughter and music filling the air.\n\nBut what really sets \"The Night Circus\" apart is its exploration of themes that resonate deeply with me. The book delves into the power of imagination, creativity, and love. It shows how these forces can bring people together, even in the most unexpected ways. And it reminds us that magic is all around us, waiting to be discovered.\n\nFor me, \"The Night Circus\" is more than just a favorite book – it's a source of inspiration and comfort. Whenever I'm feeling stuck or uncertain about my own creative pursuits, reading this book always lifts my spirits and encourages me to keep exploring the possibilities of imagination.\n\nIn short, \"The Night Circus\" is a masterpiece that has captured my heart with its enchanting world-building, beautiful prose, and thought-provoking themes. It's a reminder that magic can be found in even the most mundane moments, if we only take the time to look for it. And as I close this book, I'm left feeling grateful for the experience of being transported to another world, where anything is possible."}],"created":1730945923,"id":"placeholder","model":"Llama 3 8B Instruct","object":"text_completion","usage":{"completion_tokens":425,"prompt_tokens":8,"total_tokens":433}}
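The retry-on-400 idea suggested above could be sketched roughly as follows. This is a hypothetical helper, not part of Theia's codebase; it assumes GPT4All's error message format shown in the curl output above:

```typescript
// Hypothetical sketch of the suggested retry logic (names are invented,
// not Theia APIs). Assumes GPT4All's 400 response shape seen above.
interface OpenAiError {
    error: { message: string; type: string };
}

// Extract the offending parameter from GPT4All's error message, e.g.
// "Unrecognized request argument supplied: top_k" yields "top_k".
function unsupportedParam(body: OpenAiError): string | undefined {
    const match = /Unrecognized request argument supplied: (\w+)/
        .exec(body.error.message);
    return match?.[1];
}

// Return a copy of the request body with the rejected parameter removed,
// so the caller can retry once per 400 response.
function withoutParam(
    request: Record<string, unknown>,
    param: string
): Record<string, unknown> {
    const copy = { ...request };
    delete copy[param];
    return copy;
}
```

A caller would loop: send the request, and on each 400 with a recognizable message, strip the named parameter and retry until the request succeeds or the error is no longer parseable.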

JonasHelming commented 2 weeks ago

It is actually a pity that they do not support these parameters. Do they claim to be compatible with OpenAI? We do not even set these parameters explicitly; they are defaults (except "stream"). Have you tried Ollama or Llamafile if you need an urgent alternative? However, I have had similar experiences with other providers claiming to be compatible with OpenAI. I believe the real solution here would be to allow the user to influence the parameters for a specific model in the settings, i.e.:

  1. Whether the model supports streaming or not
  2. A blacklist of parameters not to send to the model
  3. A whitelist of parameters the user can set to specific values
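The three options above could extend the existing custom-model setting; the property names below are purely illustrative, nothing like this is implemented yet:

```json
"ai-features.openAiCustom.customOpenAiModels": [
    {
        "model": "Wizard v1.2",
        "url": "http://localhost:4891/v1/",
        "supportsStreaming": false,
        "excludedParameters": ["top_k", "repeat_penalty"],
        "defaultParameters": { "max_tokens": 4096 }
    }
]
```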

@planger WDYT?

planger commented 2 weeks ago

Which parameter is it exactly that GPT4All complains about? Is it stream or is it really top_k as in your example?

I believe this may indeed be intricate to solve, as Theia AI's OpenAI LM provider just uses the OpenAI library and doesn't itself specify those parameters explicitly, except for stream, or unless explicitly specified by the agent invoking the LLM. I didn't verify it in OpenAI's code, but if the OpenAI library really adds default values for parameters that aren't supported by GPT4All, it might be hard to avoid that from within Theia AI.
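To illustrate the point above: the request body the provider effectively sends would contain only the explicitly passed properties. This sketch is an assumption based on the discussion, not code taken from Theia's source:

```typescript
// Sketch (assumed from the discussion, not Theia's actual provider code):
// only the properties listed here end up in the serialized request body,
// so no top_k or repeat_penalty is added on the Theia side.
function buildRequestBody(model: string, prompt: string, stream: boolean) {
    return {
        model,
        messages: [{ role: 'user', content: prompt }],
        // "stream" is the one parameter set explicitly by the provider,
        // and the one GPT4All rejects.
        stream,
    };
}
```

If GPT4All still reports unknown arguments for such a request, the defaults must be injected by the client library or rejected server-side, which is exactly what makes this hard to fix from within Theia AI.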

MatthewKhouzam commented 2 weeks ago

I was on my home account there as this was on my gpu enabled computer.

Theia's chat only complained about "stream". While looking into it, I found other unsupported fields; "stream" is the first thing that needs to be fixed. I am also opening a bug on GPT4All regarding this. I do see it as a chance to make the interface more robust, though.

JonasHelming commented 2 weeks ago

Does GPT4All support streaming at all? I.e., does it complain about the property or about streaming in general?

MatthewKhouzam commented 2 weeks ago

It does not support streaming AFAIK.

The Python bindings do work with streaming... poorly. https://github.com/search?q=repo%3Anomic-ai%2Fgpt4all%20stream&type=issues

I am now testing to confirm. I tried on a CPU and an AMD GPU; now I am trying on Intel Xe.

JonasHelming commented 2 weeks ago

If you add your model id here: https://github.com/eclipse-theia/theia/blob/125821ceec0f158168858c79827ccddb1d833b70/packages/ai-openai/src/node/openai-language-model.ts#L155

You can test it very easily
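The check at the linked location might look roughly like this. The names below are assumed for illustration, not copied from Theia's source; that o1-preview is hard-coded as non-streaming is mentioned later in this thread:

```typescript
// Hedged sketch of a non-streaming model list (names assumed, not
// Theia's actual identifiers). Models in this set are called without
// stream: true.
const nonStreamingModels = new Set(['o1-preview']);

function supportsStreaming(modelId: string): boolean {
    return !nonStreamingModels.has(modelId);
}

// Adding the GPT4All model id routes it through the non-streaming
// request path for a quick local test:
nonStreamingModels.add('Wizard v1.2');
```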

JonasHelming commented 2 weeks ago

@planger I believe the "streaming" property should be configurable; we had to hard-code o1-preview already

manyoso commented 2 weeks ago

Hello, I'm the maintainer of GPT4All. We're interested in helping support these extra params but lack the time/resources given all the other things we're working on; we would welcome PRs to add them. Cheers!

TheMatthew commented 2 weeks ago

Great news: We got it working!

https://www.youtube.com/watch?v=2KWtuDbXoI8

There are two things needed to make it work:

1. Remove the "stream" token.
2. Add "max_tokens"; if we don't, it defaults to a low number.

When these two items are fixed, I will upload a new, better video.

Love the feature btw, thanks @manyoso and @JonasHelming @planger !

JonasHelming commented 2 weeks ago

@MatthewKhouzam If you add your model to the "nonStreaming" list, the first issue is solved, correct? You would just need this to be configurable, correct? As for the second one, I believe this could also be a user setting.

manyoso commented 2 weeks ago

FYI, we just added a 'minimize to system tray' feature that will be in the upcoming release; it should pair nicely with this use case.