Closed — drnic closed this 7 months ago
Yes! This is something I've been struggling with: how do we build this in a way that we have useful defaults, but have hooks to let everyone extend it easily.
This seems like a great way for people to create their own custom output adapters and experiment with different output formatting before we roll official support into the main library. In the past I've experimented with ideas like `:array_of_strings`, `:selection_from_enum`, and different types of object formats, but I'm looking for a few more examples to work with before committing the library to them... seeing what users create, share, and ask for seems like a great way to scale that!
One goal with the output adapter pattern is figuring out a DSL for formatting and structuring the LLM's output that doesn't result in a massive blob of JSON and type annotations everywhere, while also letting Blueprints generate new generators that work well out of the box. Ultimately, what the AI services call "tools/functions" I'm currently looking at as simply ways to structure the string outputs we get from them... the big unlock will be once we flesh out `Sublayer::Agents` here, and they can look up `Sublayer::Tasks` to perform, which are a much more deterministic and testable series of actions/generations.
One thing this makes me think, related to the issue you raised about Claude's tool call interface and `to_claude3_hash`, is that we probably want to change this adapter/provider relationship a lot and nail down an interface for adapters that can stay stable, at least for a little while, if users will create `output_adapter` subclasses. I'm thinking that instead of the output adapters supporting the formats and structures of the different providers, we just require that they contain all the attributes needed and have the providers decide how to use those attributes... I'll follow up with an example of what I'm thinking shortly... but the idea would be to make it simpler for anyone to create new adapters quickly and reliably, and trust that they'll work with any of the providers, rather than having to know the details of each provider's JSON or XML structure...
So here's a bit more about what I'm thinking for the output adapters and a way to make them a lot simpler and easier to build: move the provider-specific formatting into the providers rather than bloating the adapters with minor per-provider variations:
```ruby
module Sublayer
  module Components
    module OutputAdapters
      class SingleString
        attr_reader :name, :description

        def initialize(options)
          @name = options[:name]
          @description = options[:description]
        end

        def properties
          [OpenStruct.new(name: @name, type: "string", description: @description, required: true)]
        end
      end
    end
  end
end
```
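For comparison, here's a sketch of a hypothetical second adapter (not in the library; the name and behavior are illustrative) that follows the same minimal interface — expose `name`, `description`, and `properties` — so providers wouldn't need any changes to support it:

```ruby
require "ostruct"

module Sublayer
  module Components
    module OutputAdapters
      # Hypothetical adapter for a list-of-strings output; illustrates that
      # an adapter only needs to expose #name, #description, and #properties.
      class ArrayOfStrings
        attr_reader :name, :description

        def initialize(options)
          @name = options[:name]
          @description = options[:description]
        end

        def properties
          [OpenStruct.new(name: @name, type: "array", description: @description, required: true)]
        end
      end
    end
  end
end
```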
And then in providers, we do a lot more of the structuring:
```ruby
# Sublayer.configuration.ai_provider = Sublayer::Providers::OpenAI
# Sublayer.configuration.ai_model = "gpt-4-turbo-preview"
module Sublayer
  module Providers
    class OpenAI
      def self.call(prompt:, output_adapter:)
        client = ::OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
        response = client.chat(
          parameters: {
            model: Sublayer.configuration.ai_model,
            messages: [
              {
                role: "user",
                content: prompt
              }
            ],
            tool_choice: { type: "function", function: { name: output_adapter.name } },
            tools: [
              {
                type: "function",
                function: {
                  name: output_adapter.name,
                  description: output_adapter.description,
                  parameters: {
                    type: "object",
                    properties: format_properties(output_adapter),
                    required: output_adapter.properties.select(&:required).map(&:name)
                  }
                }
              }
            ]
          }
        )

        message = response.dig("choices", 0, "message")
        raise "No function called" unless message["tool_calls"]

        args_from_llm = message.dig("tool_calls", 0, "function", "arguments")
        JSON.parse(args_from_llm)[output_adapter.name]
      end

      # Build the OpenAI-specific properties hash from the provider-agnostic
      # adapter attributes.
      def self.format_properties(output_adapter)
        output_adapter.properties.each_with_object({}) do |property, hash|
          hash[property.name] = {
            type: property.type,
            description: property.description
          }
        end
      end
      private_class_method :format_properties
    end
  end
end
```
Yep, I think that should work, in that each Provider class is best suited to know which API style (inline prompt or API messages), format (JSON/XML), and schema (OpenAI function vs Claude 3 tool) it needs.
Until we actually have more OutputAdapters, it might not be obvious what their pattern for implementation is. Difficult, nay, perilous to create abstractions/design patterns when you've only got one thing to abstract/design around :)
Hah yes totally - I held back from asking if you'd tried creating any example OutputAdapters for this use case already :)
I'd expect the output adapters implementation/interface to still change significantly over time, but my thinking for this change here would be to simplify it for today to make it easier to create a bunch of examples to design around (and also solve that claude3 issue you brought up!) - I'd worry that having a bunch of hash/xml formats to build would be a barrier to creating more adapters...
But first! Going to focus on getting that test PR merged, which will make this change much safer!
Sorry for the delay on this! Had to think about how I'll go forward with the potentially breaking changes we're talking about here.
Going to merge this, bump the patch version, and release the new version of the gem.
I'll add this to the readme, but I think what makes the most sense right now is: pre-1.0, breaking changes will happen in minor versions, and additional features will come in patch versions.
So some simplifying updates to output adapters will come out in 0.1.0.
Currently we only support `llm_output_adapter type: :single_string`, mapping `type: :single_string` to the `Sublayer::Components::OutputAdapters::SingleString` class within the `Sublayer::Components::OutputAdapters` namespace. This PR adds a `class:` key to provide a string or class:

`llm_output_adapter class: "MyThing::MyAdapter"`

`llm_output_adapter class: MyThing::MyAdapter`
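One way the `class:` key could be resolved (a hedged sketch; the method name is hypothetical and the PR's actual implementation may differ) is to accept either form and constantize strings:

```ruby
# Resolve the value given to `llm_output_adapter class: ...`:
# accept a Class directly, or look a String up as a constant.
def resolve_output_adapter_class(value)
  value.is_a?(Class) ? value : Object.const_get(value.to_s)
end
```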
This is my first PR, and it came from an idea I had while reading the code and trying to understand the gist of generators + output adapters (aka functions/tools).