Improve chat completions endpoint

Bugfix

The chat completions endpoint was using the Completions model instead of the ChatCompletions model for the request input. I have fixed this and set the default max output tokens to 4096.

Feature additions

Built-in support for uploading audio, files, or images to the chat completions pipeline.
Adds faster-whisper and a new AudioToText.py file that will replace whisper-cpp-pybind in the next release.
Added speech to text and text to speech endpoints that are clones of the OpenAI endpoints but use multiple providers.
Adds support for the gpt-4-vision-preview model, allowing images to be uploaded with the same syntax as OpenAI. See the OpenAI documentation for how a request that sends images should look: https://platform.openai.com/docs/guides/vision
Adds support for vision models being used with ezLocalai using the same OpenAI endpoint syntax mentioned above.
Audio upload support through the chat completions endpoint has been implemented.
File upload support through the chat completions endpoint has been implemented.
Website scraping by giving the URL through the chat completions endpoint has been implemented.

The example below combines URL scraping with file, image, and audio uploads in a single request that also prompts the agent:

import openai

response = openai.chat.completions.create(
    model="THE AGENT'S NAME GOES HERE",
    messages=[
        {
            "role": "user",
            "prompt_category": "Default",  # To override the agent's defined prompt category
            "prompt_name": "Chat",  # To override the agent's defined prompt
            "context_results": 5,  # Optional, default will be 5
            "websearch": False,  # Set to True to enable websearch
            "websearch_depth": 0,  # How many links deep the websearch should go (3 would go 3 links deep per link it scrapes)
            "browse_links": True,  # Will make the agent scrape any web URLs the user puts in chat.
            "content": [
                {"type": "text", "text": "YOUR USER INPUT TO THE AGENT GOES HERE"},
                {
                    "type": "image_url",
                    "image_url": {  # Will download the image and send it to the vision model
                        "url": "https://www.visualwatermark.com/images/add-text-to-photos/add-text-to-image-3.webp"
                    },
                },
                {
                    "type": "text_url",  # Or just "url"
                    "url": {  # Content of the text or URL for it to be scraped
                        "url": "https://agixt.com"
                    },
                    "collection_number": 0,  # Optional field, defaults to 0.
                },
                {
                    "type": "application_url",
                    "url": {  # Will scrape mime type `application` into the agent's memory
                        "url": "data:application/pdf;base64,base64_encoded_pdf_here"
                    },
                    "collection_number": 0,  # Optional field, defaults to 0.
                },
                {
                    "type": "audio_url",
                    "url": {  # Will transcribe the audio and send it to the agent
                        "url": "data:audio/wav;base64,base64_encoded_audio_here"
                    },
                },
            ],
        },
    ],
    max_tokens=4096,
    temperature=0.7,
    top_p=0.9,
    user="THE CONVERSATION NAME"
)
print(response.choices[0].message.content)
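
The example above uses placeholders such as `base64_encoded_pdf_here` and `base64_encoded_audio_here`. A minimal sketch of how those data URLs might be produced from local files is shown here. The helper names `to_data_url` and `content_part` are hypothetical and not part of AGiXT or the OpenAI client; the content entry shapes are copied from the example above.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path, mime_type=None):
    """Encode a local file as a base64 data URL (hypothetical helper)."""
    file_path = Path(path)
    if mime_type is None:
        # Guess the MIME type from the file extension.
        mime_type, _ = mimetypes.guess_type(file_path.name)
    if mime_type is None:
        # Fall back to a generic binary type when the extension is unknown.
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def content_part(data_url):
    """Wrap a data URL in a content entry shaped like the example above."""
    if not data_url.startswith("data:"):
        raise ValueError("expected a data URL")
    # The MIME category is the part before the first slash, e.g. "audio".
    category = data_url[len("data:"):].split("/", 1)[0]
    if category == "image":
        # Images follow the OpenAI vision syntax with an "image_url" key.
        return {"type": "image_url", "image_url": {"url": data_url}}
    if category in ("audio", "application", "text"):
        # The other entry types in the example carry their payload under "url".
        return {"type": f"{category}_url", "url": {"url": data_url}}
    raise ValueError(f"unsupported MIME category: {category}")
```

The returned dictionaries can be appended to the `content` list of the user message. The MIME type guessed for some extensions varies by platform (for example, `.wav` may resolve to `audio/x-wav`), so passing `mime_type` explicitly is the safer choice when the exact prefix matters.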

Josh-XT / AGiXT

Improve chat completions endpoint #1149
