monarchwadia / ragged


API changes for Chat and Embed #18

Open monarchwadia opened 2 months ago

monarchwadia commented 2 months ago

We should change what the c.chat and e.embed methods return, so that callers can get at the original request and response objects (including API headers), token counts and rate limiting information, and a few utility functions such as getLastMessage, which would make life easier than writing messages.at(-1)?.text every time (a mouthful).

These changes are needed to make Ragged more usable. I have some immediate use cases that depend on them.

So I think Chat should turn into something like the following... and Embed will also follow suit (not shown below).

// Not a final API, just a sketch of some possibilities.
const {
  history,
  incomingMessages,
  rawResponse,
  rawRequest,
  getLastMessage
} = await c.chat("What is a rickroll?")
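To make the getLastMessage idea concrete, here is a minimal sketch of how such a return object might be built. The Message shape, the ChatResult type and the buildChatResult helper are all assumptions made up for illustration, not Ragged's actual types, and rawRequest / rawResponse are left out to keep it short.

```typescript
// Hypothetical shapes for illustration only; not Ragged's actual types.
type Message = { type: "user" | "bot" | "system"; text: string };

type ChatResult = {
  history: Message[];
  incomingMessages: Message[];
  getLastMessage: () => Message | undefined;
};

// Builds the result object from the prior history and the newly received messages.
function buildChatResult(previous: Message[], incoming: Message[]): ChatResult {
  const history = [...previous, ...incoming];
  return {
    history,
    incomingMessages: incoming,
    // Same result as history.at(-1), written as a closure so it keeps
    // working after the caller destructures it off the result object.
    getLastMessage: () => history[history.length - 1],
  };
}

const result = buildChatResult(
  [{ type: "user", text: "What is a rickroll?" }],
  [{ type: "bot", text: "A bait-and-switch prank link." }]
);
console.log(result.getLastMessage()?.text);
```

Because getLastMessage closes over history instead of using this, it survives the destructuring shown in the sketch above.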

What properties and methods would you like to see on here?

Anything that's missing?

Anything new that needs to be added?

Anything that needs to be changed, or maybe that needs to be explained more?

monarchwadia commented 2 months ago

One more consideration: It has occurred to me that the Chat instance could just be rewritten as an instance of the BaseChatAdapter class.

Looked at this way, it opens up avenues for vertical composability of adapters.

For example, if Chat simply extended BaseChatAdapter and added a few extra bells and whistles like history persistence, then that persistence could be one layer in a vertical stack of adapters.

This could find use in composition patterns. Essentially, Ragged becomes a middleware system. This could replace Langchain's LCEL system with something that's potentially a lot more generic, standardized, and composable in a way that is actually easy to reason about.

Just a thought.
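A rough sketch of what that vertical stack might look like, assuming every layer shares one chat-shaped interface. Every name here (ChatAdapter, echoAdapter, withHistory, withLogging) is hypothetical and invented for this example; none of it is Ragged's current API.

```typescript
// All names here are hypothetical; this is not Ragged's actual API.
type Message = { type: "user" | "bot"; text: string };
type ChatRequest = { history: Message[] };
type ChatResponse = { history: Message[] };

interface ChatAdapter {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Innermost layer: stands in for a real provider adapter.
const echoAdapter: ChatAdapter = {
  async chat(request) {
    const last = request.history[request.history.length - 1];
    const reply: Message = { type: "bot", text: `echo: ${last?.text ?? ""}` };
    return { history: [...request.history, reply] };
  },
};

// A layer that remembers history across calls and delegates to the layer below.
function withHistory(inner: ChatAdapter): ChatAdapter {
  let stored: Message[] = [];
  return {
    async chat(request) {
      const response = await inner.chat({ history: [...stored, ...request.history] });
      stored = response.history;
      return response;
    },
  };
}

// A layer that logs the size of each request before passing it through.
function withLogging(inner: ChatAdapter, log: (line: string) => void): ChatAdapter {
  return {
    async chat(request) {
      log(`sending ${request.history.length} message(s)`);
      return inner.chat(request);
    },
  };
}

// Layers stack like middleware: history on top, logging in the middle, provider at the bottom.
const stack = withHistory(withLogging(echoAdapter, console.log));
```

Each layer only knows about the interface of the layer below it, which is what would let persistence, logging, or anything else be added or removed independently.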

monarchwadia commented 2 months ago

WHAT WE HOPE TO GET

To support rate limits, we need a stronger type system where Adapter types seamlessly flow up to Chat. Once this is done, each adapter can define its own rate limit body, etc. And finally, we can make Chat just another implementation of the BaseChatAdapter interface.

To do this requires a few refactors. Here are the general steps.

monarchwadia commented 2 months ago

Here is a sketch of the updated ChatAdapter interface.

This will also be used as part of the public interface, so that ChatAdapters are composable. At that point, Ragged's Chat instance will just be another adapter. Nothing special.

import { Message } from "../Chat.types";
import { Tool } from "../../tools/Tools.types";

// ==================== Request types ====================

export type ChatAdapterRequest = {
    history: Message[];
    tools?: Tool[];
    model?: string;
}

// ==================== Response types ====================

interface GChatAdapterGenerics {
    Response: {
        RateLimits: unknown;
    }
}

export type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
    history: Message[];
    rateLimits: G['Response']['RateLimits'];
    meta: {
        chatAdapterRequest: ChatAdapterRequest;
        rawFetchRequest: Request;
        rawFetchResponse: Response;
    }
}

// ==================== Adapter types ====================

export interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
    chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

abstract class Cool {
    protected abstract cool(): void;
}
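To show how the generics in the interface sketch would let a rate limit type flow up from an adapter, here is a minimal, self-contained example. It repeats trimmed copies of the types (tools and the meta field are omitted so it compiles without the fetch typings), and FakeProviderGenerics, FakeProviderAdapter and the rate limit fields are made-up assumptions, not a real provider's format.

```typescript
// Trimmed copies of the sketched types so the example compiles on its own.
type Message = { type: "user" | "bot"; text: string }; // hypothetical shape
type ChatAdapterRequest = { history: Message[]; model?: string };

interface GChatAdapterGenerics {
  Response: { RateLimits: unknown };
}

type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
  history: Message[];
  rateLimits: G["Response"]["RateLimits"];
};

interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

// Hypothetical rate limit body for an imaginary provider.
interface FakeProviderGenerics extends GChatAdapterGenerics {
  Response: { RateLimits: { remainingRequests: number; resetSeconds: number } };
}

class FakeProviderAdapter implements BaseChatAdapter<FakeProviderGenerics> {
  async chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<FakeProviderGenerics>> {
    const reply: Message = { type: "bot", text: "stubbed reply" };
    return {
      history: [...request.history, reply],
      // A real adapter would parse these values from the provider's response.
      rateLimits: { remainingRequests: 59, resetSeconds: 1 },
    };
  }
}

async function demo(): Promise<number> {
  const adapter = new FakeProviderAdapter();
  const response = await adapter.chat({ history: [{ type: "user", text: "hi" }] });
  // rateLimits is typed as the provider's own shape here, not unknown.
  return response.rateLimits.remainingRequests;
}
```

With the default generic argument, rateLimits stays unknown, so code that does not care about a specific provider can still accept any adapter.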

monarchwadia commented 1 month ago

On second thoughts... I'm kind of rethinking the above. I think we already have a good base with Chat and Embed. The adapters don't necessarily have to compose all the way to the top -- I actually can't think of any good use cases for that. And it'll be extremely awkward and a lot of work to do that at this point.

I think a simpler approach could be modifying the BaseChatAdapter Request to instead be a Context object... something like the following:

export type ChatAdapterRequest = {
    apiClient: ApiClient;
    request: {
        history: Message[];
        tools?: Tool[];
        model?: string;
    };
}

This way, Chat and Embed can pass down various utilities that are fully controlled & provided at the top layer. The adapters then have to do less work in order to be functional.
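As a rough illustration of the context-object idea, the sketch below has the top layer own the client and hand it down to the adapter. The ApiClient shape (a single post method), the exampleAdapter, the placeholder URL and the "default-model" fallback are all assumptions; the thread does not specify what ApiClient would actually look like.

```typescript
// Hypothetical types; ApiClient's real shape is not specified in this thread.
type Message = { type: "user" | "bot"; text: string };

interface ApiClient {
  post(url: string, body: unknown): Promise<unknown>;
}

type ChatAdapterRequest = {
  apiClient: ApiClient;
  request: { history: Message[]; model?: string };
};

// The adapter only maps between shapes; the transport is supplied from above.
const exampleAdapter = {
  async chat(ctx: ChatAdapterRequest): Promise<{ history: Message[] }> {
    const raw = await ctx.apiClient.post("https://example.invalid/chat", {
      model: ctx.request.model ?? "default-model",
      messages: ctx.request.history,
    });
    // Treat a string body as the reply text; serialize anything else.
    const text = typeof raw === "string" ? raw : JSON.stringify(raw);
    return { history: [...ctx.request.history, { type: "bot", text }] };
  },
};

// Because the top layer owns the client, a test can swap in a fake one.
const fakeClient: ApiClient = {
  async post(_url, _body) {
    return "canned reply";
  },
};
```

One consequence of this design is that things like retries, headers, or request logging could live in the client that Chat and Embed provide, without each adapter reimplementing them.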

Hmm.... will sleep on it.