eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0

Proposal for new keyword `using` #330

Open gamendez98 opened 4 months ago

gamendez98 commented 4 months ago

This proposal is a generalized solution for this issue.

Currently there are several proposals to add support for a variety of new features offered by some LLMs. However, this is not practical: since each vendor/model may implement custom features, it doesn't make sense to have to extend LMQL every time there is an innovation.

To solve this, and to avoid LMQL being left behind, I propose adding a new keyword to the language, `using`:

"[OUTPUT]" where … using ModelCallConfig(**kargs)

This keyword lets the user pass a ModelCallConfig that can define arbitrary arguments for the model call. The ModelCallConfig object achieves this by modifying the data that is about to be sent to the model.

To do this, we define in the config class a serializer and an "injector" for each parameter the config accepts. The "injector" takes the full prompt data as input and inserts the serialized parameter in the appropriate place.

The data is only passed through an injector if its associated parameter was actually passed; the default serializer is the __str__ function.
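
For illustration, here is a minimal sketch of how the base class and the two decorators could work. None of these names exist in LMQL today; this is just one possible shape for the proposal:

from typing import Any, Callable, Dict


def lmql_param_serializer(param_name: str):
    # Mark a method as the serializer for a given config parameter.
    def decorator(fn: Callable):
        fn._lmql_serializes = param_name
        return fn
    return decorator


def lmql_param_injector(param_name: str):
    # Mark a method as the injector for a given config parameter.
    def decorator(fn: Callable):
        fn._lmql_injects = param_name
        return fn
    return decorator


class ModelCallConfig:

    def serialize(self, param: str) -> Any:
        # Use the registered serializer for `param` if there is one,
        # falling back to __str__ as the default serializer.
        for attr in dir(self):
            fn = getattr(self, attr)
            if getattr(fn, "_lmql_serializes", None) == param:
                return fn(getattr(self, param))
        return str(getattr(self, param))

    def apply(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Run each injector over the request payload, but only if its
        # associated parameter was actually set on the config.
        for attr in dir(self):
            fn = getattr(self, attr)
            param = getattr(fn, "_lmql_injects", None)
            if param is not None and getattr(self, param, None) is not None:
                data = fn(data)
        return data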

Example

Here is an example of usage for the previously mentioned issue:

from inspect import signature
from typing import Callable, List


class OpenAIToolsCallConfig(OpenAICallConfig):

    def __init__(self, tools: List[Callable], **kwargs):
        self.tools = tools
        super().__init__(**kwargs)

    @lmql_param_serializer("tools")
    def tools_serializer(self, tools: List[Callable]):
        # Build OpenAI's "tools" payload from each function's signature
        # and docstring (the get_description_* helpers parse docstrings).
        return [{
            "type": "function",
            "function": {
                "name": tool.__name__,
                "description": get_description_from_docstring(tool.__doc__),
                "parameters": {
                    "type": "object",
                    "properties": {
                        p_name: {
                            "type": p_sig.annotation.__name__,
                            "description": get_description_from_doc(tool.__doc__, p_name),
                        }
                        for p_name, p_sig in signature(tool).parameters.items()
                    },
                    "required": [],
                },
            },
        } for tool in tools]

    @lmql_param_injector("tools")
    def tools_injector(self, data):
        # Attach the serialized tool schemas to the outgoing request.
        data["tools"] = self.tools_serializer(self.tools)
        data["tool_choice"] = "auto"
        return data

And then, in LMQL:

def add(a: int, b: int):
    '''
    Adds two numbers.
    This function takes two parameters, 'a' and 'b', and returns their sum.

    Parameters:
    - a (int): The first number.
    - b (int): The second number.
    '''
    return a + b

"{:user} I need you to add two numbers together 1 and 2"
"{:asistant}[answer]" using OpenAICallConfig(tools=[add])
do_something(answer)

So a request that would normally look like this

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "I need you to add two numbers together 1 and 2"
    }
  ]
}

would become

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "add",
        "description": "Adds two numbers. This function takes two parameters, 'a' and 'b', and returns their sum.",
        "parameters": {
          "type": "object",
          "properties": {
            "a": {
              "type": "int",
              "description": "The first number."
            },
            "b": {
              "type": "int",
              "description": "The second number."
            }
          },
          "required": []
        }
      }
    }
  ],
  "tool_choice": "auto"
}

Other uses
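
The same mechanism should cover other vendor-specific parameters without touching LMQL itself. As a hypothetical sketch (OpenAI's response_format parameter is real, the config class below is not):

class OpenAIJSONModeCallConfig(OpenAICallConfig):

    def __init__(self, json_mode: bool = True, **kwargs):
        self.json_mode = json_mode
        super().__init__(**kwargs)

    @lmql_param_injector("json_mode")
    def json_mode_injector(self, data):
        if self.json_mode:
            # Forces the model to return valid JSON via OpenAI's JSON mode.
            data["response_format"] = {"type": "json_object"}
        return data

which would be used as "{:assistant}[answer]" using OpenAIJSONModeCallConfig(json_mode=True).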

Open question

Some model features may require custom handling of the response; in those cases it might be necessary to add a callback to the config.
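
One possible shape for that, sketched purely as a strawman (neither lmql_response_handler nor this response layout exists; response is assumed to be the parsed assistant message from the chat API):

import json


def lmql_response_handler(fn):
    # Mark a method as the post-processor for the raw model response.
    fn._lmql_handles_response = True
    return fn


class OpenAIToolsCallConfig(OpenAICallConfig):
    ...

    @lmql_response_handler
    def handle_response(self, response):
        # If the model requested a tool call, execute it and return the
        # result; otherwise hand the completion back unchanged.
        for call in response.get("tool_calls") or []:
            tool = next(t for t in self.tools if t.__name__ == call["function"]["name"])
            return tool(**json.loads(call["function"]["arguments"]))
        return response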