matlab-deep-learning / llms-with-matlab

Connect MATLAB to LLM APIs, including OpenAI® Chat Completions, Azure® OpenAI Services, and Ollama™

Request for Custom API Endpoint Support #14

Closed: toshiakit closed this issue 1 month ago

toshiakit commented 6 months ago

per https://github.com/toshiakit/MatGPT/issues/30

Hi,

I'm exploring MatGPT's integration capabilities with LLMs and am interested in extending its utility to custom models, particularly those deployed locally.

In Python projects, customizing the base_url, as seen in openai-python https://github.com/openai/openai-python/issues/913, is a straightforward approach to support custom API endpoints.

Although I'm not familiar with the specific construction of MatGPT, could a similar method be applied here to enable the use of in-house or custom LLMs via user-defined API endpoints? Your insights or guidance on this possibility would be greatly appreciated.

This refers to the use of custom assistants https://community.openai.com/t/custom-assistant-api-endpoints/567107

toshiakit commented 4 months ago

@jonasendc provided this comment:

Hi @toshiakit, maybe something like:

import openai

client = openai.AzureOpenAI(
    api_version="2024-03-01-preview",
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # placeholder: your own endpoint
    api_key=api_key,
)

Where azure_endpoint is some URL other than OpenAI's. So somewhere in this app the URL is hardcoded; it should be dynamic.

toshiakit commented 4 months ago

@Mingzefei provided this comment:

Hi, sorry for the late reply. @jonasendc has provided a case for AzureOpenAI, and I'd like to add another case about locally deployed LLMs.

For example, projects like Ollama allow for easy local deployment and use of many open-source LLMs. After a successful deployment, Ollama by default starts an API service on local port 11434. Calling this API service is very similar to calling the OpenAI API, as shown below:

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required, but unused
)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
print(response.choices[0].message.content)

As you can see, generally only the base_url needs to be modified.
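
The same local endpoint can also be reached directly from MATLAB with the built-in web functions. The following is only a rough sketch, not MatGPT code: it assumes the default Ollama port and the OpenAI-style /v1/chat/completions route, and the model name is just the one from the example above.

% Rough sketch: call a local Ollama server's OpenAI-compatible endpoint
% directly from MATLAB (default port 11434, llama3 as an example model).
endpoint = "http://localhost:11434/v1/chat/completions";

messages = struct( ...
    'role',    {'system', 'user'}, ...
    'content', {'You are a helpful assistant.', 'Who won the world series in 2020?'});
body = struct('model', 'llama3', 'messages', messages);

opts = weboptions('MediaType', 'application/json', 'Timeout', 60);
response = webwrite(endpoint, body, opts);   % webwrite JSON-encodes the struct
disp(response.choices(1).message.content)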

Additionally, being able to call Ollama or other locally deployed LLMs might have the following advantages:

The ability to use LLMs fine-tuned for specific tasks like MATLAB, which might yield better performance and lower costs than ChatGPT.

Providing a solution for countries or regions where ChatGPT is not available.

I hope this adds to the discussion and look forward to your thoughts.

sunweihunu commented 3 months ago

A temporary solution is as follows: 1) open the file +llms/+internal/callOpenAIChatAPI.m, and 2) change END_POINT to your custom API endpoint in this line: END_POINT = "https://api.openai.com/v1/chat/completions";

That should work for OpenAI-style LLMs, as sketched below.
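
For an Ollama-style local server such as the one described above, the edited line might look like this (the exact URL depends on your deployment; this is only an illustration):

% In +llms/+internal/callOpenAIChatAPI.m, replace the hardcoded endpoint:
% END_POINT = "https://api.openai.com/v1/chat/completions";
END_POINT = "http://localhost:11434/v1/chat/completions";   % custom/local endpoint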

ccreutzi commented 3 months ago

As noted in the PR linked above, that may or may not work with any allegedly OpenAI-compatible endpoint. A little more discovery is needed to decide how well we can support “arbitrary” endpoints in general.

ccreutzi commented 1 month ago

With the Azure and Ollama support now in main (including the small ways in which they need to differ), and support for non-local Ollama servers coming in #53, I'm closing this request. Experience with the endpoints currently supported shows that only a subset of functionality can be supported uniformly, and that the data formats differ in practice.
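
For anyone landing here later: with that support merged, talking to a local Ollama model no longer requires editing internal files. A minimal sketch (the model name is an example; see the repository documentation for the exact interface and options):

% Assumes a local Ollama server on the default port with llama3 pulled.
chat = ollamaChat("llama3");
txt = generate(chat, "Who won the world series in 2020?");
disp(txt)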