Closed: tranhoangnguyen03 closed this issue 2 months ago
There are several things to consider:

The OpenAI API provides the functions parameter to ensure structured output.
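For reference, this is the mechanism py-llm-core relies on. A minimal sketch using the pre-1.0 openai SDK (the store_book schema, model name, and prompt are illustrative, not part of py-llm-core):

import json
import openai

# Ask the model to answer by "calling" a function whose JSON schema
# describes the structure we want back (illustrative schema)
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Foundation, by Isaac Asimov"}],
    functions=[{
        "name": "store_book",
        "description": "Store a book record",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
            },
            "required": ["title", "author"],
        },
    }],
    function_call={"name": "store_book"},
)

# The structured answer comes back as a JSON string in the arguments field
arguments = json.loads(
    response["choices"][0]["message"]["function_call"]["arguments"]
)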
If the target server running an OpenAI-like API supports this => subclassing llm_core.llm.OpenAIChatModel is the easiest route. You would have to:

- set api_base: openai.api_base = "http://<Your api-server IP>:port"
- override the ctx_size property to return the context size of the target model

This means writing something like the following should work:
import openai

from llm_core.llm import OpenAIChatModel
from llm_core.parsers import BaseParser

# Point the openai client at the custom OpenAI-compatible endpoint
openai.api_base = "http://<Your api-server IP>:port"


class MyOpenAICompatibleLLM(OpenAIChatModel):
    name: str = "< model name >"
    ctx_size: int = 4000


class MyOpenAICompatibleParser(BaseParser):
    def __init__(
        self,
        target_cls,
        model="< model name >",
        completion_kwargs=None,
        *args,
        **kwargs
    ):
        super().__init__(target_cls, *args, **kwargs)

        # Your custom initialization code should be written here
        self.completion_kwargs = (
            {} if completion_kwargs is None else completion_kwargs
        )

        # Wrap the target model so the parser talks to the custom endpoint
        self.model_wrapper = MyOpenAICompatibleLLM(
            name=model,
            system_prompt=(
                "Act as a powerful AI able to extract, parse and process "
                "information from unstructured content."
            ),
        )
        self.ctx_size = self.model_wrapper.ctx_size
        self.model_name = self.model_wrapper.name
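Usage would then mirror the library's built-in parsers. A sketch, assuming BaseParser keeps the context-manager interface shown in the py-llm-core README (the Book dataclass and sample text are illustrative):

from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    published_year: int

# Parse unstructured text into a Book instance via the custom endpoint
with MyOpenAICompatibleParser(Book) as parser:
    book = parser.parse("Foundation, by Isaac Asimov, was published in 1951.")
    print(book)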
Support for Azure deployments was added. Closing the issue.
Python-llama-cpp and OpenRouter.ai are gaining traction among LLM users because they can serve open-source models over an OpenAI-like API interface. Personally, I often deploy a local model to a Colab notebook and then tunnel the endpoint to an Ngrok public URL for consumption.
I wonder if it's possible for py-llm-core to accommodate this pattern of usage. This would also be useful for users of the Azure OpenAI service, since it also requires providing a custom API endpoint.
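For context, the Colab + Ngrok pattern described above can be sketched roughly as follows. This assumes the llama-cpp-python server extra and pyngrok are installed (pip install "llama-cpp-python[server]" pyngrok); the model path and port are placeholders:

import subprocess
from pyngrok import ngrok

# Serve a local GGUF model over an OpenAI-compatible API on port 8000
server = subprocess.Popen(
    ["python", "-m", "llama_cpp.server",
     "--model", "model.gguf", "--port", "8000"]
)

# Expose the local server through a public ngrok URL; the tunneled URL
# (plus "/v1") is what a client would use as openai.api_base
tunnel = ngrok.connect(8000)
print(tunnel.public_url)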