sublayerapp / sublayer

A model-agnostic Ruby Generative AI DSL and framework. Provides base classes for building Generators, Actions, Tasks, and Agents that can be used to build AI powered applications in Ruby.
https://docs.sublayer.com
MIT License
119 stars 2 forks

Allow OutputAdapters to be from any namespace #7

Closed drnic closed 7 months ago

drnic commented 7 months ago

Currently we only support llm_output_adapter type: :single_string, which maps type: :single_string to the Sublayer::Components::OutputAdapters::SingleString class within the Sublayer::Components::OutputAdapters namespace.

This PR adds a class: key that accepts either a string or a class.
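As a rough illustration of the idea, here is a minimal sketch of how a type: or class: option might be resolved to an adapter class. The Demo and MyApp modules, the build_output_adapter method, and the CustomString class are all hypothetical names made up for this example; this is not the code from the PR.

```ruby
# Hypothetical sketch: Demo, MyApp and build_output_adapter are illustrative
# names, not part of sublayer's actual API.
module Demo
  module OutputAdapters
    class SingleString
      attr_reader :name, :description

      def initialize(options)
        @name = options[:name]
        @description = options[:description]
      end
    end
  end

  # Builds an adapter from `class:` (a Class or a class-name String),
  # falling back to a `type:` lookup inside Demo::OutputAdapters.
  def self.build_output_adapter(options)
    klass =
      case options[:class]
      when Class  then options[:class]
      when String then Object.const_get(options[:class])
      else
        # :single_string -> "SingleString"
        const_name = options.fetch(:type).to_s.split("_").map(&:capitalize).join
        OutputAdapters.const_get(const_name)
      end
    klass.new(options)
  end
end

# A custom adapter living outside the library's namespace.
module MyApp
  class CustomString < Demo::OutputAdapters::SingleString; end
end

by_type   = Demo.build_output_adapter(type: :single_string, name: "title", description: "A title")
by_class  = Demo.build_output_adapter(class: MyApp::CustomString, name: "title", description: "A title")
by_string = Demo.build_output_adapter(class: "MyApp::CustomString", name: "title", description: "A title")

puts by_type.class
puts by_class.class
puts by_string.class
```

Accepting a string as well as a class means the adapter constant does not have to be loaded at the point where the option is declared.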

This is my first PR and was an idea I had from reading the code and trying to understand the gist of generators + output adapters (aka functions/tools).

swerner commented 7 months ago

Yes! This is something I've been struggling with: how do we build this so that we have useful defaults, but also hooks that let everyone extend it easily?

This seems like a great way for people to create their own custom output adapters and experiment with different output formatting before we roll official support for it into the main library. In the past I've experimented with ideas like :array_of_strings, :selection_from_enum, and different types of object formats, but I'm looking for a few more examples to work with before committing the library to them... Seeing what users create, share, and ask for seems like a great way to scale that!

One goal with the output adapter pattern is to figure out a DSL for formatting and structuring the LLM's output that doesn't result in a massive blob of JSON and type annotations everywhere, while still being able to use Blueprints to generate new generators that function well out of the box. Ultimately, I'm currently looking at what the AI services call "tools/functions" as simply ways to structure the string outputs we get from them... The big unlock will come once we flesh out Sublayer::Agents here and they can look up Sublayer::Tasks to perform, which are a much more deterministic and testable series of actions/generations.

One thing this makes me think, related to the issue you raised about Claude's tool call interface and to_claude3_hash, is that we probably want to change this adapter/provider relationship a lot, and nail down an interface for adapters that can be stable at least for a little bit if users will create output_adapter subclasses. I'm thinking that instead of the output adapters supporting the formats and structures of the different providers, we just require that they contain all the attributes needed and have the providers decide how to use those attributes... I'll follow up with an example of what I'm thinking about shortly, but the idea would be to make it simpler for anyone to create new adapters quickly and reliably, and trust that they'll work with any of the providers, rather than having to know the details of each provider's JSON or XML structure...

swerner commented 7 months ago

So here's a bit more about what I'm thinking for the output_adapters and a way to make them a lot simpler, easier to build, and move the provider-specific formatting into the providers rather than blow up adapters with more minor variations:

require "ostruct"

module Sublayer
  module Components
    module OutputAdapters
      # Holds only the provider-neutral attributes; providers decide how to format them.
      class SingleString
        attr_reader :name, :description

        def initialize(options)
          @name = options[:name]
          @description = options[:description]
        end

        # One entry per field the LLM should return.
        def properties
          [OpenStruct.new(name: @name, type: "string", description: @description, required: true)]
        end
      end
    end
  end
end

And then in providers, we do a lot more of the structuring:

# Sublayer.configuration.ai_provider = Sublayer::Providers::OpenAI
# Sublayer.configuration.ai_model = "gpt-4-turbo-preview"

require "json"
require "openai"

module Sublayer
  module Providers
    class OpenAI
      def self.call(prompt:, output_adapter:)
        client = ::OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])

        response = client.chat(
          parameters: {
            model: Sublayer.configuration.ai_model,
            messages: [
              {
                "role": "user",
                "content": prompt
              }
            ],
            # Force the model to call the single tool built from the adapter.
            tool_choice: { type: "function", function: { name: output_adapter.name } },
            tools: [
              {
                type: "function",
                function: {
                  name: output_adapter.name,
                  description: output_adapter.description,
                  parameters: {
                    type: "object",
                    properties: format_properties(output_adapter),
                    required: output_adapter.properties.select(&:required).map(&:name)
                  }
                }
              }
            ]
          })

        # With tools, the call arrives under "tool_calls" rather than "function_call".
        function_call = response.dig("choices", 0, "message", "tool_calls", 0, "function")
        raise "No function called" unless function_call

        args_from_llm = function_call["arguments"]
        JSON.parse(args_from_llm)[output_adapter.name]
      end

      # Converts the adapter's properties into a JSON Schema "properties" hash.
      def self.format_properties(output_adapter)
        output_adapter.properties.each_with_object({}) do |property, hash|
          hash[property.name] = {
            type: property.type,
            description: property.description
          }
        end
      end
      private_class_method :format_properties
    end
  end
end

drnic commented 7 months ago

Yep, I think that should work, in that each Provider class is best suited to know what API (inline prompt, or API message), format (JSON/XML), and schema (OpenAI function vs Claude 3 tool) it needs.
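To make that split concrete, here is a minimal sketch of one provider-neutral adapter being rendered into two tool schemas. The ToolFormatting module and its method names are hypothetical, and a plain Struct stands in for the OpenStruct used earlier in the thread. The two hash shapes follow the OpenAI function-tool format and the Claude tool format (input_schema) as I understand them, so treat them as assumptions to check against each provider's documentation.

```ruby
require "json"

# Stand-in for the OpenStruct property objects used earlier in the thread.
Property = Struct.new(:name, :type, :description, :required, keyword_init: true)

# Provider-neutral adapter, mirroring the SingleString proposal above.
class SingleString
  attr_reader :name, :description

  def initialize(options)
    @name = options[:name]
    @description = options[:description]
  end

  def properties
    [Property.new(name: @name, type: "string", description: @description, required: true)]
  end
end

# Hypothetical helper (not part of sublayer): each method plays the role of
# one provider deciding how to use the adapter's attributes.
module ToolFormatting
  module_function

  # Shared JSON Schema built purely from the adapter's properties.
  def json_schema(adapter)
    props = adapter.properties.each_with_object({}) do |property, hash|
      hash[property.name] = { type: property.type, description: property.description }
    end
    required = adapter.properties.select(&:required).map(&:name)
    { type: "object", properties: props, required: required }
  end

  # OpenAI-style tool: schema nested under function.parameters.
  def openai_tool(adapter)
    {
      type: "function",
      function: { name: adapter.name, description: adapter.description, parameters: json_schema(adapter) }
    }
  end

  # Claude-style tool: schema under input_schema.
  def claude_tool(adapter)
    { name: adapter.name, description: adapter.description, input_schema: json_schema(adapter) }
  end
end

adapter = SingleString.new(name: "summary", description: "A one-sentence summary")
puts JSON.pretty_generate(ToolFormatting.openai_tool(adapter))
puts JSON.pretty_generate(ToolFormatting.claude_tool(adapter))
```

The adapter never mentions either provider, so a new adapter only has to expose name, description, and properties.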

Until we actually have more OutputAdapters, it might not be obvious what their pattern for implementation is. Difficult, nay, perilous to create abstractions/design patterns when you've only got one thing to abstract/design around :)

swerner commented 7 months ago

Hah yes totally - I held back from asking if you'd tried creating any example OutputAdapters for this use case already :)

I'd expect the output adapters implementation/interface to still change significantly over time, but my thinking for this change here would be to simplify it for today to make it easier to create a bunch of examples to design around (and also solve that claude3 issue you brought up!) - I'd worry that having a bunch of hash/xml formats to build would be a barrier to creating more adapters...

But first! Going to focus on getting that test PR merged, which will make this change much safer!

swerner commented 7 months ago

Sorry for the delay on this! Had to think about how I'll go forward with the potentially breaking changes we're talking about here.

Going to merge this, bump the patch version, and release the new version of the gem.

I'll add this to the readme, but I think what makes the most sense right now is that, pre-1.0, breaking changes will happen in minor versions and additional features will come in patch versions.

So some simplifying updates to input adapters will come out in 0.1.0
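For anyone depending on the gem under that policy, a pessimistic version constraint is one way to accept patch releases while holding back the potentially breaking minor release. The sketch below checks such a constraint with RubyGems' own classes. The version numbers 0.0.1 and 0.0.5 are illustrative assumptions, not the gem's actual release history.

```ruby
require "rubygems"

# Illustrative constraint only; "0.0.1" is an assumed version number.
# In a Gemfile this would be written as: gem "sublayer", "~> 0.0.1"
patch_only = Gem::Requirement.new("~> 0.0.1") # means >= 0.0.1 and < 0.1.0

patch_release = Gem::Version.new("0.0.5") # new features under the stated policy
minor_release = Gem::Version.new("0.1.0") # may contain breaking changes

puts patch_only.satisfied_by?(patch_release) # prints true
puts patch_only.satisfied_by?(minor_release) # prints false
```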