agnaistic / agnai

AI Agnostic (Multi-user and Multi-bot) Chat with Fictional Characters. Designed with scale in mind.
https://agnai.chat
GNU Affero General Public License v3.0

Proposition: Middlewares #114

Open acidbubbles opened 1 year ago

acidbubbles commented 1 year ago

I want to make a more structured proposition for middlewares. This is a follow-up to the Discord conversation.

The idea

I propose a middleware system, where prompts are sent down a chain of middlewares, passed to the AI backend, and the response is returned back up through the chain. This would allow middlewares to intercept, hijack, or transform prompts and responses, as well as store additional data about the conversation. Middlewares could also extend responses with information for the UI to process (for visual-novel-style characters, text emphasis, etc.).

Implementation


I have yet to look at the code in more detail, but the general idea is pretty simple. A middleware would simply be a class whose invoke method receives a "next" function that calls the next middleware, with the final one responsible for calling the AI backend.

```ts
interface Middleware {
  invoke(prompt: PromptContext, next: MiddlewareFunction): Promise<ResponseContext>
}
```

where PromptContext could look like

```ts
interface PromptContext {
  preset: GenPreset
  history: ChatHistory
  character: Character
  prompt: string
  extensions: Record<string, unknown>
}
```

and ResponseContext would be the same, but with message instead of prompt.
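For completeness, here is a minimal sketch of ResponseContext and the "next" function type, assuming the shapes above (the MiddlewareFunction name is an assumption on my part, as is returning a Promise to accommodate HTTP calls):

```ts
// Hypothetical: mirrors PromptContext, but carries the generated message
// back up the chain instead of the outgoing prompt.
interface ResponseContext {
  preset: GenPreset
  history: ChatHistory
  character: Character
  message: string
  // Arbitrary per-middleware data, e.g. mood tags for the UI to render.
  extensions: Record<string, unknown>
}

// The "next" link: either the next middleware's invoke, or the final
// function that calls the AI backend. Async, since middlewares may need
// to make their own HTTP calls.
type MiddlewareFunction = (prompt: PromptContext) => Promise<ResponseContext>
```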

(NOTE: I know this returns streams, not strings, but the idea is the same, except that some middlewares would need to buffer the stream while others could run on the stream directly.)
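To make the control flow concrete, here is one possible way the chain could be wired up. This is only a sketch; buildChain and sendToAdapter are hypothetical names, not anything from the codebase:

```ts
// Hypothetical sketch: fold the middleware list into a single function,
// so each middleware receives the rest of the chain as its `next` argument.
function buildChain(
  middlewares: Middleware[],
  callBackend: MiddlewareFunction
): MiddlewareFunction {
  return middlewares.reduceRight<MiddlewareFunction>(
    (next, mw) => (prompt) => mw.invoke(prompt, next),
    callBackend
  )
}

// Usage: the final link is the function that actually calls the AI adapter.
// const run = buildChain([summaryMiddleware, moodMiddleware], sendToAdapter)
// const response = await run(promptContext)
```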

Use cases

Some examples of what this would enable (discussed below): summarizing chat history before it reaches the backend, running a secondary model to detect the character's mood and drive the UI, and LuminAI-style stateless pre- and post-processors.

Why an issue for this

Because I want to try some crazy ideas like the summarize one, and I think this would make experimentation easier.

I'd also love to make Phoenix Wright-style text that is influenced by the mood of the response, which would rely on the ability to run additional models in the pipeline, without necessarily building a full-blown feature right away.
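As a concrete illustration, a mood middleware under this design might look something like the following sketch; classifyMood stands in for whatever secondary model would be run, and the extensions key is made up:

```ts
// Hypothetical stand-in for a call to a small classifier model.
declare function classifyMood(text: string): Promise<string>

// Sketch of a middleware that annotates the response with a detected mood,
// which the UI could then use to pick an avatar or a text style.
class MoodMiddleware implements Middleware {
  async invoke(prompt: PromptContext, next: MiddlewareFunction): Promise<ResponseContext> {
    // Let the rest of the chain (and ultimately the backend) run first.
    const response = await next(prompt)
    response.extensions['mood'] = await classifyMood(response.message)
    return response
  }
}
```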

This is all just theory, and I feel like this might be a much larger bite than I can chew. But I'd love to know what you think, if you had something similar in mind, and if not, whether experiments like this could be helpful to try.

Note that I also see that the adapters are "hardcoded"; this might be a good opportunity to "pluginify" middlewares and backends together, at least structurally.

The end goal would be for the LuminAI middlewares (at least as I understood them) to simply be pre- and post-processors implemented as stateless services, so this isn't only about experimentation, though that is my personal objective.

acidbubbles commented 1 year ago

Update: I just don't see how streaming could work, since most middlewares would need to buffer the response anyway if they need to call something over HTTP; and if the emotion/state of the character is going to drive things like the avatar or the text style, it should (I think) be known before the text shows up.
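Assuming responses arrive as an async iterable of tokens, buffering in a middleware would amount to something like this sketch, which is exactly what kills incremental display:

```ts
// Hypothetical: collect a token stream into a full string so a middleware
// can post-process the whole response, at the cost of incremental display.
async function bufferStream(tokens: AsyncIterable<string>): Promise<string> {
  let text = ''
  for await (const chunk of tokens) {
    text += chunk
  }
  return text
}
```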

So, long story short, will text streaming survive LuminAI improvements?