mbleigh opened 3 hours ago
I generally like it - but a few questions:

1. Streaming with generateX

`streamGenerate()` today is `generateStream()`. Do you mean to change that, or just an oversight? Otherwise:

`ai.streamGenerate()` --> `ai.generateStream()`
`ai.streamGenerateResponse()` --> `ai.generateResponseStream()`
2. Streaming for multi-turn generation

How do you get a streamed response from `ai.send()`? If we're being consistent with `generate()`, then it would be `ai.sendStream()`.
3. Arguments for ai.send

Does `ai.send` accept the same arguments as `ai.generateResponse()`? Does it return the same response object? If so, what's the difference between the two?
> I generally like it - but a few questions:
>
> 1. Streaming with generateX
>
> `streamGenerate()` today is `generateStream()`. Do you mean to change that, or just an oversight? Otherwise:

Hmm, mostly accidental but maybe intentional after some thought. The problem is that `generateStream` makes sense, but `sendStream` sounds like you're sending the stream, not receiving one back.
> `ai.streamGenerate()` --> `ai.generateStream()`
> `ai.streamGenerateResponse()` --> `ai.generateResponseStream()`
> 2. Streaming for multi-turn generation
>
> How do you get a streamed response from `ai.send()`? If we're being consistent with `generate()`, then it would be `ai.sendStream()`.
Yeah, forgot to write that up; `ai.streamSend` would be the proposal.
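For concreteness, a minimal toy sketch of the shape `ai.streamSend` might take (everything here is a stand-in, not the real Genkit API): the point of the name is that "stream" modifies the return value, since you pass a message in and get a stream of chunks back.

```typescript
// Toy sketch of a streamSend-style call (hypothetical; not real Genkit code).
// A real implementation would stream model output; this fakes chunks by word.
async function* streamSend(message: string): AsyncGenerator<string> {
  for (const word of message.split(' ')) {
    yield word;
  }
}

async function main(): Promise<string> {
  let reply = '';
  // The caller consumes chunks as they arrive.
  for await (const chunk of streamSend('hello streaming world')) {
    reply += chunk + ' ';
  }
  return reply.trim();
}

main().then((r) => console.log(r));
```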
> 3. Arguments for ai.send
>
> Does `ai.send` accept the same arguments as `ai.generateResponse()`? Does it return the same response object? If so, what's the difference between the two?
I'm imagining them as being two things, but they're really, really similar, so it's maybe a judgment call whether they deserve to be different things. I'm imagining `generateResponse` returns a `GenerateResponse`, which does not necessarily have `send()` on it.

But maybe...maybe they are just the same thing, and the extra "stuff you want to do with the response" of `send()` means that it's also sufficient for "single-turn but want more metadata".

I like the idea of calling this a `Conversation`, but in theory it could maybe replace `GenerateResponse`? Hmm...
This is a proposed breaking API change for Genkit to streamline the most common scenarios while keeping the flexibility and capability level constant. The changes can be broken down into three components:
Default Model Configurations
While one of the strengths of Genkit is the ability to easily swap between multiple models, we find in practice that most people use a single model as their "go-to" with other models swapped in as needed. The same goes for model configuration -- most of the time you're going to want the same settings.
Proposed is to encourage setting a default model (now just called `model`) when initializing Genkit, as well as the ability to define model settings when instantiating a reference to a model. Both model and configuration can still be overridden at call time, but this makes it easier to set a common reusable baseline.
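For concreteness, the defaults described above might look roughly like this sketch. This is a toy stand-in, not the actual Genkit API; the model names and the `Genkit` constructor shape are placeholders.

```typescript
// Toy model of the proposed defaults (illustrative only, not real Genkit code).
interface ModelRef {
  name: string;
  config?: { temperature?: number };
}

class Genkit {
  // A default model (now just `model`) is set once at initialization.
  constructor(private defaults: { model: ModelRef }) {}

  // Per-call options still override the instance-level baseline.
  generate(prompt: string, opts?: { model?: ModelRef }): string {
    const model = opts?.model ?? this.defaults.model;
    return `[${model.name}] ${prompt}`;
  }
}

// A model reference can carry its own settings.
const flash: ModelRef = { name: 'gemini-1.5-flash', config: { temperature: 0.7 } };
const pro: ModelRef = { name: 'gemini-1.5-pro' };

const ai = new Genkit({ model: flash });

console.log(ai.generate('Tell me a joke.'));             // uses the default model
console.log(ai.generate('Be serious.', { model: pro })); // call-time override
```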
Streamlining Generation
Most of the time, what you want from a `generate()` call is the data that is being generated. Today this requires a two-line "get response, get output from response" pattern, which gets tedious when working with e.g. multi-step processes.

Proposed is to simplify to a `generate` API that will return text or structured data depending on call configuration. This can get more complex if you want it to.
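The original code samples did not survive extraction here, but the shape might look something like the following sketch, with stub implementations standing in for real model calls (the names follow the proposal; the response fields are assumptions):

```typescript
// Stub of the proposed generate()/generateResponse() split (illustrative only).
interface GenerateResponse<T> {
  output: T;                      // the generated text or structured data
  usage: { totalTokens: number }; // example metadata field
  finishReason: string;
}

// generateResponse() keeps today's behavior: the full response object.
function generateResponse(prompt: string): GenerateResponse<string> {
  // A real call would hit a model; this stub just echoes the prompt.
  return { output: `echo: ${prompt}`, usage: { totalTokens: 3 }, finishReason: 'stop' };
}

// generate() returns just the output, collapsing the two-line
// "get response, get output from response" pattern into one call.
function generate(prompt: string): string {
  return generateResponse(prompt).output;
}

const text = generate('hi');         // 'echo: hi'
const full = generateResponse('hi'); // full object when metadata is needed
console.log(text, full.finishReason);
```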
When developers do want to dig into the metadata of the response, they can use a new `generateResponse` method, which will be equivalent to `generate` today.

Streaming will be supported through `streamGenerate` and `streamGenerateResponse`. When doing `streamGenerate`, the chunks emitted will be in output form (either a partial data response or a string chunk).

Multi-Turn Generation
All of the above is great if you only have single-turn generation, but it doesn't really help for a chatbot scenario. Fundamentally, multi-turn use cases are pretty different and deserve better attention in the API surface.
Proposed is a new `Chat` class and a new `send()` method that lets you explicitly opt in to multi-turn conversational use cases.
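A minimal sketch of how a `Chat`/`send()` pair could carry history across turns. This is a toy implementation to show the opt-in statefulness; a real `send()` would pass the accumulated history to the model.

```typescript
// Toy model of the proposed Chat surface (illustrative only, not real Genkit code).
interface Message {
  role: 'user' | 'model';
  text: string;
}

class Chat {
  // The Chat instance owns the conversation history, so each send() is multi-turn.
  private history: Message[] = [];

  send(text: string): string {
    this.history.push({ role: 'user', text });
    // A real implementation would call the model with the full history;
    // here we echo with the turn count to show that state is kept.
    const turn = this.history.filter((m) => m.role === 'user').length;
    const reply = `turn ${turn}: ${text}`;
    this.history.push({ role: 'model', text: reply });
    return reply;
  }
}

const chat = new Chat();
console.log(chat.send('Hi!'));          // 'turn 1: Hi!'
console.log(chat.send('Tell me more')); // 'turn 2: Tell me more'
```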