Open Tooniixx opened 1 year ago
Thank you for posting the first issue!
I'm currently working on upgrading the "openai" dependency, and it seems possible to customize "base_url", which is set to https://api.openai.com/v1 by default.
Would customizing "base_url" be enough?
(Your compatible server should provide a {base_url}/chat/completions endpoint.)
Thank you for your work.
Customizing "base_url" should work.
This will be a good starting point. Compatible servers (LocalAI, text-generation-webui, etc.) should work.
Making it all compatible would be wonderful.
I added a "base_url" field to the integration setup (like below) in version 0.0.7-beta2.
Since it uses GET {base_url}/engines to validate authentication, it may fail on compatible servers if they don't support the /engines endpoint.
Please let me know if that's the case; then I will probably have to use a different approach for those compatible servers.
Indeed, the ones I've tested don't support the /engines endpoint.
But it's a promising project!
Okay! Maybe I will have to use /chat/completions to authenticate with a dummy message.
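(For reference, a minimal sketch of what such a dummy-message check could look like, assuming the openai v0.x Python client; the helper name is illustrative, not the integration's actual code.)

import openai
from openai import error

async def validate_authentication(base_url: str, api_key: str, model: str) -> bool:
    """Send a tiny dummy chat completion; if the server rejects the key, report failure."""
    try:
        await openai.ChatCompletion.acreate(
            api_base=base_url,
            api_key=api_key,
            model=model,
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        return True
    except error.AuthenticationError:
        return False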
Yes, maybe! I found this in the documentation: https://help.openai.com/en/articles/6283125-what-happened-to-engines
Could you try 0.0.7-beta4? I temporarily used a chat completion to validate compatible servers.
Does your server support /v1/models?
Yes, maybe! I found this in the documentation: https://help.openai.com/en/articles/6283125-what-happened-to-engines
I was going to bump the "openai" version, which supports the /v1/models API, but it caused a dependency conflict when used together with "openai_conversation". If compatible servers support /v1/models, maybe I will make the call over plain HTTP rather than through the function provided by the openai library.
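(A rough sketch of that plain-HTTP approach, assuming aiohttp, which Home Assistant already bundles; the helper name is illustrative.)

import aiohttp

async def validate_via_models(base_url: str, api_key: str) -> bool:
    """GET {base_url}/models and treat HTTP 200 as successful authentication."""
    async with aiohttp.ClientSession() as session:
        async with session.get(
            f"{base_url.rstrip('/')}/models",
            headers={"Authorization": f"Bearer {api_key}"},
        ) as resp:
            return resp.status == 200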
Great! It works.
I use LocalAI as a server, and it supports /v1/models.
However, to make it work, I had to rename my language template to "gpt-3.5-turbo" during the initial module configuration.
Then, in the wizard configuration, I'm able to change the template name. But I had to use a few tricks at first.
I manage to control my entities via your module. It's amazing to be able to do this locally!
However, I can't execute scenes. In fact, when requesting scene execution, the entity returned is scene.scene.my_scene, but the real entity is scene.my_scene.
I don't yet know whether this bug is due to my installation.
Coming back to the /v1/models endpoint: if it's not blocking at the moment, I think you can leave it as it is until the dependencies are fixed.
Good to hear that!
Making a call to /v1/models is not a big deal. I will fix that pretty soon.
I will look into the scene problem. It is not because of your installation, but because of how the model works.
It might be fixed by changing the prompt, the descriptions of functions, or the descriptions of function properties.
I just released 0.0.7-beta5, which uses the /models endpoint to authenticate.
Please try this version and see if the workaround is no longer needed.
It works perfectly! I can change the name of my model afterwards.
It seems to be stable.
However, in your initial prompt, in the available devices, how do you automate the exposed_entities variable?
Great! I just released this in version 0.0.7. Thanks for your help!
As for the exposed_entities variable, could you elaborate more?
By automating the exposed_entities variable, if you mean registering entities to or removing them from exposed_entities, you can expose entities from http://{your-home-assistant}/config/voice-assistants/expose like below.
As for exposed entities, I've declared a few.
But the exposed_entities variable is unknown in my installation.
Hmm... This is unexpected. What core version are you using?
Maybe the problem is that I'm using the Docker version of HA? Strange... I'm using version 2023.11.1.
I use HA in Docker too. My development environment uses version 2023.11.0.dev0, so that shouldn't be the problem. (I will try 2023.11.1 tomorrow.) Aren't there any error or warning logs?
I just tried 2023.11.1, but I can't reproduce the expose_entity undefined issue.
I have no particular error in the logs... If other people have the same problem, we can compare.
Thanks a lot!
Hi, I'm close to where you're at and would like to help solve this issue, but I'm stuck one step before you. I'm currently using text-generation-webui with the OpenAI extension. Currently I can get the AI to work within Home Assistant using the 'Custom Open AI' integration, which doesn't let me control the lights or anything, but it makes me think that my OpenAI extension is working. However, when I try with this integration I get:
Sorry, I had a problem talking to OpenAI: Invalid response object from API: 'Internal Server Error' (HTTP response code was 500)
When you add the integration, what do you put as an API key? Since it's locally hosted, I don't believe I have an API key unless I've missed it somewhere. The 'Custom' integration didn't require it, so I think that's the part I'm missing.
Thanks for y'all's help!
Actually, I just found a mess of logs in my text-generation-webui when I try to use it. It looks like the main complaint is 'functions is not supported'. Any idea how to get around that? Logs below:
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/fastapi/applications.py", line 1106, in __call__
    await super().__call__(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/fastapi/routing.py", line 274, in app
    raw_response = await run_endpoint_function(
  File "/home/stokley/Apps/AI/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/stokley/Apps/AI/text-generation-webui-main/extensions/openai/script.py", line 128, in openai_chat_completions
    response = OAIcompletions.chat_completions(to_dict(request_data), is_legacy=is_legacy)
  File "/home/stokley/Apps/AI/text-generation-webui-main/extensions/openai/completions.py", line 496, in chat_completions
    return deque(generator, maxlen=1).pop()
  File "/home/stokley/Apps/AI/text-generation-webui-main/extensions/openai/completions.py", line 174, in chat_completions_common
    raise InvalidRequestError(message="functions is not supported.", param='functions')
extensions.openai.errors.InvalidRequestError
Man, sorry to post like three times in a row, but for anyone else running into this issue it just seems like text-generation-webui's openai extension doesn't support functions. Either gotta install LocalAI or figure something else out. Fudge!
See the issue here:
https://github.com/oobabooga/text-generation-webui/issues/4286
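(Side note: if you want to check up front whether your OpenAI-compatible server accepts the functions parameter at all, a rough probe along these lines, here with plain requests and a dummy function spec, will surface the same 'functions is not supported' error without going through Home Assistant.)

import requests

def supports_functions(base_url: str, api_key: str, model: str) -> bool:
    """Send a minimal chat completion that includes a placeholder function spec.

    Servers that don't implement function calling (e.g. text-generation-webui's
    OpenAI extension at the time of writing) reject the request, while servers
    that do implement it accept it.
    """
    resp = requests.post(
        f"{base_url.rstrip('/')}/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": "ping"}],
            "functions": [
                {
                    "name": "noop",
                    "description": "Placeholder function used only for this probe.",
                    "parameters": {"type": "object", "properties": {}},
                }
            ],
        },
        timeout=30,
    )
    return resp.ok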
I use LocalAI, and it works pretty well!
By the way, can you tell me if you have the same problem as I do in this post: https://github.com/jekalmin/extended_openai_conversation/issues/17#issuecomment-1805783667 Thanks!
Thanks, I'm debating going over to LocalAI, but I really like the GUI of the text-generation one.
Unfortunately, I can't get far enough to tell you whether I have that same problem, as I can't get much further until I switch over to LocalAI. I was able to expose entities, but I don't get that error so far.
I don't want to derail this from your original issue, but are you running an Nvidia GPU, and would you mind sharing a docker-compose with a GPTQ model? I've been messing around with LocalAI for a bit and I have to say... so far it's pretty awful, at least for me. It's poorly documented and has examples that just don't work.
To be clear, I can get it running using llama.cpp, but AutoGPT doesn't see the models, and exactly following the examples doesn't seem to work.
At this point, if someone else reads this, I'd almost rather just wait a week until I'm sure someone adds functions to text-generation-webui. If you haven't used that yet, I'd highly encourage checking it out. Way easier, way simpler, and way more customizable, at least from my limited testing.
Okay, sorry, now that I've had my temper tantrum: I'm just running LocalAI on the CPU, terribly slowly, but at least as a proof of concept.
How do you get the error in #17? I can expose entities and it doesn't seem to throw that error; however, after exposing the lights and asking Assist to turn them off, I get "Something went wrong: Unable to find service switch.pantry_lights".
Got it! If you've previously already exposed everything to HA Assist you need to unexpose everything, then expose it again!
It's working now, thanks! Now all I have to do is get my gpu acceleration working with LocalAI, but that's a whole different thing. Thanks for the amazing integration!
Hello,
I wanted to say how well this works with OpenAI. It's brilliant how well it works with HA and my Atom5 Echo via voice
I just got LocalAI working via docker on Ubuntu. I can query it via the json commands in the terminal and would like for this to work with this integration.
Could you please tell me what I should use for base_url, and what to put for the API key? It doesn't seem like an API key should be mandatory for a locally hosted server.
Thanks
Hey Anto,
I can help you with that.
1) base_url should be 127.0.0.1:8080, if you've made no changes to the docker file. So your whole base_url line should be http://127.0.0.1:8080/v1
2) I don't think it should be mandatory either, but I just put 0 in there and it works fine.
If it doesn't work, let me know. If it does work, would you mind testing something for me? When you get it set up, asking it to turn the lights on and off should work, but if you ask it a general knowledge question like 'Who is Mario?', would you let me know if that still works?
I believe that with the template size I'm hitting some kind of memory limitation, as the AI can't respond to most questions, although it can still turn the lights on and off and such. I'm curious whether you experience something similar.
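(One way to sanity-check that theory is to count how many tokens the rendered prompt actually consumes. A rough sketch with tiktoken follows; the OpenAI tokenizer only approximates local models' tokenizers, and the file name is a placeholder for wherever you dump the rendered prompt.)

import tiktoken

def estimate_prompt_tokens(prompt: str, model: str = "gpt-3.5-turbo") -> int:
    """Return an approximate token count for a rendered prompt."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(prompt))

# If this number approaches the model's context window (often 2048 or 4096
# tokens for local llama-based models), little room is left for the answer.
with open("rendered_prompt.txt") as f:  # placeholder path
    print(estimate_prompt_tokens(f.read()))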
Hey @Someguitarist
Thanks so much! Yes, I kept everything stock, but when I try to input this information, it won't let me save and it errors out.
Any idea why? If I remove the :8080, it accepts it.
Oh! You probably have ufw enabled! Try typing in your Linux command line 'sudo ufw allow 8080', then close the LocalAI docker and the HomeAssistant docker, and reopen them.
Okay, fantastic. I will try that a bit later, as I just left my home, and report back in a bit. Thank you!
One other comment I want to make, just to confirm things here: I have LocalAI installed in Docker on a separate computer running Ubuntu. Further, my Home Assistant server is running on a separate Raspberry Pi.
Simply using my Chrome browser, I can actually query the model where LocalAI is installed by using the IP address url.
So I'm wondering whether the base URL should still be 127.0.0.1, or whether it should be the IP address of the Ubuntu machine?
I just wanted to mention something else that I tried. It accepted these credentials:
However, when I try to use it, it gives me an error message.
I should mention that the LocalAI API is accessible through Chrome; I can query the model.
So I'm not sure if this is a bug of some sort, or perhaps related to our conversation above about configuration.
Yeah, it would be the local IP of that PC. It looks like it's connecting! I saw that at one point, but I can't remember what I did to resolve it.
Try this just for a spot check: on the Extended OpenAI integration, under the service you created, hit 'Configure'. Remove everything from the prompt template, and then just write one sentence that says 'Answer questions truthfully' or something. If you leave it blank, it'll auto-populate with the normal template when you hit save.
When I do that, I can interact with everything like normal, although it can't control the house or anything; just use that to make sure it's working.
Another thing to check: it could be related to which model you're using. I believe only models that run through 'llama.cpp' can actually call functions. I'm currently using "vicuna-7B-1.5.Q4_k_m.gguf".
I can have a full conversation with mine now with the template being blank, but it can't control the house. When I add the template back, it can control the house, but can't seem to answer any questions. Let me know if you get it to give you an answer!
@Someguitarist I can confirm a few things now that I've been playing with this for a bit.
My base_url is http://192.168.1.119:8080/v1. I cannot use gpt-3.5-turbo in this integration, but I can use luna-ai-llama2. Both of these models work if I submit via the terminal (see later). So, just to recap: when I use a query via json like the one below, whether using gpt-3.5-turbo or luna-ai-llama2, they both work, but there are some errors when I use gpt-3.5-turbo with this integration. When using luna-ai-llama2, it does not seem to control my house anymore, but all other queries work. Here's an example query and output:
anto@anto-HP-ZBook-17-G3:~$ curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "What is an alpaca?"}],
"temperature": 0.1
}'
{"created":1700362191,"object":"chat.completion","id":"37ceabf8-2d0f-4346-b744-42cf2892f71e","model":"gpt-3.5-turbo","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"An Alpaca is a domesticated camelid that originated in South America. They are known for their wool, which is used in a variety of textiles and clothing. Alpacas are also used in the production of milk, cheese and meat."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}anto@anto-HP-ZBook-17-G3:~$
It sounds like we're mostly at the same place: I can run a json query and can use Assist to ask questions, but adding the template seems to break the AI. Weirdly, with the template and the 'broken' AI, it will still turn the lights on and off, but it just can't answer general questions. If I remove the template, it obviously can't control the house, but it can answer general questions just the same as a json question.
I almost think it's some kind of memory limitation we're hitting by including the template. I've been messing with changing the context size, the number of tokens, temperature, p & k sampling settings... and I just can't seem to figure it out. The next thing I plan on looking at is whether there's a Docker limitation that's holding it back. Maybe it's not getting assigned enough RAM, or threads, or something?
But it sounds like you're right where I am, at least. I'll keep playing with it this week and let you know if I come up with anything.
Thanks for sharing information. I will see if I can find anything related to this problem this week too.
I was playing with LocalAI more and properly set up a model. I got it running with a ChatGPT-style front-end interface, all locally. It works really well, whether via json or the front end.
However, now it completely breaks this integration, even if I delete the prompt.
This is the error I get:
Sorry, I had a problem talking to OpenAI: rpc error: code = Unknown desc = inference failed {"error":{"code":500,"message":"rpc error: code = Unknown desc = inference failed","type":""}} 500 {'error': {'code': 500, 'message': 'rpc error: code = Unknown desc = inference failed', 'type': ''}} <CIMultiDictProxy('Date': 'Thu, 23 Nov 2023 20:43:32 GMT', 'Content-Type': 'application/json', 'Content-Length': '94')>
I'm not an expert at this, but perhaps this helps: I wonder if the model only accepts v2-type submissions while the integration can only handle v1-type queries?
Actually, never mind what I said above. I had not selected the proper model in the configure page; now that I've fixed that, I'm back to the other error message:
Something went wrong: function 'None' does not exist
Something went wrong: function 'None' does not exist
This happens when a function call is made, but the model tries to call a function that is not defined.
Add logging to see how the model responded.
Since the model tries to use a function that we did not provide, the ways to fix it that I can think of are tweaking the prompt to only use the provided functions, trying a different model, or waiting for an AI expert to answer.
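(To illustrate the error, here is a hedged sketch, not the integration's actual code, of the kind of check that produces it: the model's reply names a function, and that name has to match one of the function specs supplied in the request; if the model hallucinates a call, the name can literally come back as None.)

def resolve_function(message: dict, functions: list[dict]) -> dict:
    """Look up the function the model asked for; fail if we never offered it."""
    call = message.get("function_call") or {}
    name = call.get("name")
    for spec in functions:
        if spec["name"] == name:
            return spec
    raise ValueError(f"function '{name}' does not exist")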
Thanks. Interesting. I enabled the log:
logger:
  logs:
    custom_components.extended_openai_conversation: info
I restarted HA and ran the query again; it gave the same error, but no logs appeared in core?
BUT I used the Voice Assistant debug feature in HA, and this is the trace:
Then the raw data:
stage: done
run:
  pipeline: 01hfz4pz3kw3t470gp93qrwkh2
  language: en
  runner_data:
    stt_binary_handler_id: null
    timeout: 300
events:
  - type: run-start
    data:
      pipeline: 01hfz4pz3kw3t470gp93qrwkh2
      language: en
      runner_data:
        stt_binary_handler_id: null
        timeout: 300
    timestamp: "2023-11-24T05:35:10.062189+00:00"
  - type: intent-start
    data:
      engine: 77b677b53c2732cd94f946f3e060f01f
      language: "*"
      intent_input: how are you?
      conversation_id: null
      device_id: null
    timestamp: "2023-11-24T05:35:10.062393+00:00"
  - type: intent-end
    data:
      intent_output:
        response:
          speech:
            plain:
              speech: "Something went wrong: function 'None' does not exist"
              extra_data: null
          card: {}
          language: "*"
          response_type: error
          data:
            code: unknown
        conversation_id: 01HFZX69REW3C2T4KY9RY8AHZ3
    timestamp: "2023-11-24T05:35:41.900511+00:00"
  - type: run-end
    data: null
    timestamp: "2023-11-24T05:35:41.900734+00:00"
intent:
  engine: 77b677b53c2732cd94f946f3e060f01f
  language: "*"
  intent_input: how are you?
  conversation_id: null
  device_id: null
  done: true
  intent_output:
    response:
      speech:
        plain:
          speech: "Something went wrong: function 'None' does not exist"
          extra_data: null
      card: {}
      language: "*"
      response_type: error
      data:
        code: unknown
    conversation_id: 01HFZX69REW3C2T4KY9RY8AHZ3
I wonder if this would also help: the model also accepts raw json. To give you an idea, here is the request:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "How are you?"}],
"temperature": 0.9
}'
and this is the response:
{"created":1700720425,"object":"chat.completion","id":"d644bfba-ba1d-421d-b0a6-cc201d7fe1ca","model":"gpt-4","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Hello! I'm an AI assistant that helps people find information. How can I help you today?"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens"
I'm no dev, but if your integration sends and receives json like this, I think it should work; I'm just not sure whether HA is somehow not compatible with this model.
Do you have a recommended model?
Over the past couple of days, I was able to play with this a bit more. I tried another model, a Vicuna-13B-based model, but unfortunately it's still not working with your integration.
However, if I delete the prompt and the functions, I can at least get LocalAI into my Home Assistant thanks to your integration. It, of course, does not have access to my entities, which is unfortunate, because the cloud-based service works beautifully well with your integration and HA.
@jekalmin, have you tested a model with LocalAI that works with your integration as intended? I'd be happy to try it out.
If you can suggest a 13b based gguf model, I can try it out and let you know!
Unfortunately, I haven't tested with LocalAI models yet. I'm having trouble installing LocalAI (v1.40.0) on Docker with CPU only; I get an "unimplemented" error like this. I tried to install it on my Mac, but that still didn't work.
So I'm still working on adding an sqlite function to query states from the database.
restarted HA and ran the query again and gave the same error but not logs appeared in core
Logs should appear in your system log, not anywhere in the HA web UI.
Oh, that's amazing to have that sqlite function!
Yes, it wasn't straightforward for me to install LocalAI. However, I found out that if you have an older version of the Docker image, you need to purge that first and then reinstall. I can share some install instructions that I used on a fresh install of Ubuntu with an Nvidia GPU. Of course, not everything will apply to you, but I think most of it will.
Also, just one more thing, @jekalmin: did you have a look at this page?
https://localai.io/features/openai-functions/
It seems the problem here is the implementation of functions with LocalAI. This could explain why it works when I remove the prompt and functions from your integration. Is it using a different format?
@jekalmin I spoke to some of the devs at LocalAI. Perhaps this is why it's not working with your integration:
0.1.0-beta1 uses tools to call functions, but 0.0.x versions don't use tools, so there shouldn't be a problem with 0.0.x versions.
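(For anyone comparing the two wire formats, roughly: the 0.0.x releases send the legacy functions / function_call fields, while a tools-style request wraps the same spec in a tools list. The payloads below are only illustrative; the function name is made up.)

# Legacy (openai v0-style) request body, as sent by the 0.0.x releases:
legacy_payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Turn on the kitchen light"}],
    "functions": [
        {"name": "example_function", "parameters": {"type": "object", "properties": {}}}
    ],
    "function_call": "auto",
}

# Tools-style request body (what 0.1.0-beta1 reportedly moved to):
tools_payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Turn on the kitchen light"}],
    "tools": [
        {
            "type": "function",
            "function": {"name": "example_function", "parameters": {"type": "object", "properties": {}}},
        }
    ],
    "tool_choice": "auto",
}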
The problem might be something else; it could be a memory limitation, as @Someguitarist said in https://github.com/jekalmin/extended_openai_conversation/issues/17#issuecomment-1817888853, but I'm not certain yet.
I will keep trying to install LocalAI first, but it seems it will be difficult to find a solution even once it's installed.
@Anto79-ops - If I had to venture a guess, it's likely that the model you are using doesn't support functions. Only certain models and loaders actually support function calling. I'm actually able to use this integration to turn lights on and off and such using LocalAI, but if I ask it any question it just repeats the question back. Once family leaves I can take a look and see what models and settings I'm using, but my first guess would be to try another model. I'm on one of the Vicuna-1.5 ones, I believe.
@jekalmin - Yeah, personally I hate the LocalAI setup. I'm using it for this integration, then text-generation-webui for everything else. LocalAI is easily the worst of all the locally hosted solutions, but so far it's the only one that has supported function calling, for me at least. If you do want some help, I can probably guide you through some installation steps, although I'll say that if you're only using a CPU and no GPU, it probably won't be a great time.
I'm thinking my issue is that the context size is running out with a template that includes so many entities. I have a few ideas to fix it, but I haven't had the time to implement any of them. I also linked a blog above with another solution, where it asks the AI to repeat its answer in the form of a Home Assistant JSON response, and it's monkey-patched to pull that JSON response out and send it to the Home Assistant API (a rough sketch of that idea follows below). The advantage there is that you don't require function calling; however, getting it to work with local solutions would require a lot more Python than I know, unfortunately.
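(Purely as a sketch of that idea, not the blog's implementation: ask the model to restate its answer as a Home Assistant service call in JSON, pull the first JSON object out of the reply, and POST it to the HA REST API. The URL and token below are placeholders.)

import json
import re
import requests

HA_URL = "http://homeassistant.local:8123"  # placeholder: your HA address
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"   # placeholder: created in your HA profile

def call_service_from_reply(reply_text: str) -> None:
    """Extract a {"domain": ..., "service": ..., "data": ...} blob and execute it via the HA REST API."""
    match = re.search(r"\{.*\}", reply_text, re.DOTALL)
    if not match:
        return  # the model did not include a JSON service call
    call = json.loads(match.group(0))
    requests.post(
        f"{HA_URL}/api/services/{call['domain']}/{call['service']}",
        headers={"Authorization": f"Bearer {HA_TOKEN}"},
        json=call.get("data", {}),
        timeout=10,
    )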
Either way, this integration and function calling are working for me to control entities. It's just that, for some reason, the AI can't really respond afterwards.
thanks all.
@Someguitarist I think you are correct in saying that it depends on the model. I think I had the Vicuna model and experienced exactly that: it would repeat the question back if I asked it a question.
I was speaking some more with the LocalAI dev/helper there, and she said that
"if its running openai v0 you can use llama grammers but you will need to remake the system propmt and make a grammers list def and a few other things"
and quoted an example of grammar use:
def grammarly(task_list):
    # Build a grammar that restricts output to exactly one of the given tasks.
    temp_task_str = ""
    temp_task_int = 1
    for task in task_list:
        print(str(task) + f" {temp_task_int} of {len(task_list)}")
        if temp_task_int < len(task_list):
            # Not the last task: quote it and append the alternation separator.
            temp_task_str += "\"" + task + "\" | "
            temp_task_int = temp_task_int + 1
        else:
            # Last task: no trailing "|".
            temp_task_str += "\"" + task + "\""
    grammar = f"root ::= ({temp_task_str})"
    return grammar
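(For example, the helper above just builds a llama.cpp-style GBNF grammar string that limits the model's output to one of the listed options:)

print(grammarly(["turn_on", "turn_off", "toggle"]))
# printed while building:
#   turn_on 1 of 3
#   turn_off 2 of 3
#   toggle 3 of 3
# returned grammar:
#   root ::= ("turn_on" | "turn_off" | "toggle")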
I then played with this integration using one of her custom models, and she confirmed that this integration sends v0-style requests:
response = await openai.ChatCompletion.acreate(
    api_base=self.entry.data.get(CONF_BASE_URL),
    api_key=self.entry.data[CONF_API_KEY],
    model=model,
    messages=messages,
    max_tokens=max_tokens,
    top_p=top_p,
    temperature=temperature,
    user=user_input.conversation_id,
    functions=functions,
    function_call=function_call,
)
but she also told me that it uses grammars when it processes them.
I don't know how much of this information is relevant (if at all), but anything to get this working with LocalAI :)
Is it possible to add an option to modify the server URL, so that we can use our own OpenAI-API-compatible servers?
Thanks! Good work!