Closed: tranhoangnguyen03 closed this issue 2 months ago
There are several things to consider:

The OpenAI API provides the functions parameter to ensure structured output.
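For reference, this is the mechanism py-llm-core relies on. A minimal sketch using the pre-1.0 openai SDK (the store_book schema, model name, and prompt are illustrative, not part of py-llm-core):

import json
import openai

# Ask the model to answer by "calling" a function whose JSON schema
# describes the structure we want back (illustrative schema)
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Foundation, by Isaac Asimov"}],
    functions=[{
        "name": "store_book",
        "description": "Store a book record",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
            },
            "required": ["title", "author"],
        },
    }],
    function_call={"name": "store_book"},
)

# The structured answer comes back as a JSON string in the arguments field
arguments = json.loads(
    response["choices"][0]["message"]["function_call"]["arguments"]
)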
If the target server running an OpenAI-like API supports this => subclassing llm_core.llm.OpenAIChatModel is the easiest route. You would have to:

- set api_base: openai.api_base = "http://<Your api-server IP>:port"
- override the ctx_size property to return the context size of the target model

This means writing something like the following should work:
import openai

from llm_core.llm import OpenAIChatModel
from llm_core.parsers import BaseParser

# Point the openai client at the custom OpenAI-compatible endpoint
openai.api_base = "http://<Your api-server IP>:port"


class MyOpenAICompatibleLLM(OpenAIChatModel):
    name: str = "< model name >"
    ctx_size: int = 4000


class MyOpenAICompatibleParser(BaseParser):
    def __init__(
        self,
        target_cls,
        model="< model name >",
        completion_kwargs=None,
        *args,
        **kwargs
    ):
        super().__init__(target_cls, *args, **kwargs)

        # Your custom initialization code should be written here
        self.completion_kwargs = (
            {} if completion_kwargs is None else completion_kwargs
        )

        # Wrap the target model so the parser talks to the custom endpoint
        self.model_wrapper = MyOpenAICompatibleLLM(
            name=model,
            system_prompt=(
                "Act as a powerful AI able to extract, parse and process "
                "information from unstructured content."
            ),
        )
        self.ctx_size = self.model_wrapper.ctx_size
        self.model_name = self.model_wrapper.name
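Usage would then mirror the library's built-in parsers. A sketch, assuming BaseParser keeps the context-manager interface shown in the py-llm-core README (the Book dataclass and sample text are illustrative):

from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    published_year: int

# Parse unstructured text into a Book instance via the custom endpoint
with MyOpenAICompatibleParser(Book) as parser:
    book = parser.parse("Foundation, by Isaac Asimov, was published in 1951.")
    print(book)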
Support for Azure deployments was added. Closing the issue.
Python-llama-cpp and OpenRouter.ai are gaining traction among LLM users because they can serve open-source models over an OpenAI-like API interface. Personally, I often deploy a local model to a Colab notebook and then tunnel the endpoint to an Ngrok public URL for consumption.
I wonder if it's possible for py-llm-core to accommodate this pattern of usage. This would also be useful for users of the Azure OpenAI service, since it also requires providing a custom API endpoint.
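For context, the Colab + Ngrok pattern described above can be sketched roughly as follows. This assumes the llama-cpp-python server extra and pyngrok are installed (pip install "llama-cpp-python[server]" pyngrok); the model path and port are placeholders:

import subprocess
from pyngrok import ngrok

# Serve a local GGUF model over an OpenAI-compatible API on port 8000
server = subprocess.Popen(
    ["python", "-m", "llama_cpp.server",
     "--model", "model.gguf", "--port", "8000"]
)

# Expose the local server through a public ngrok URL; the tunneled URL
# (plus "/v1") is what a client would use as openai.api_base
tunnel = ngrok.connect(8000)
print(tunnel.public_url)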