Closed samos123 closed 7 months ago
It might be feasible to keep the message body exactly the same as an OpenAI HTTP request. PubSub and other cloud offerings all have native support for message metadata.
I prefer to keep it as generic and flexible as possible without relying on the message metadata features of specific providers. Relying on them would limit us if we later need to add something to the message. It would also limit our ability to add a new provider and support the same API if that provider has a slightly different implementation of metadata, or none at all. I guess I don't see any benefits of using the native message metadata support.
It's also important to enforce the id parameter so people take a moment to add an ID field, because otherwise it will be almost impossible for end-users to know which request triggered the response.
I still think having a body field is preferred.
I think .id [int] is not needed. The user can populate .metadata.* with their identifier(s). They might have a string id, or they might have a composite id made up of multiple fields.
We do need a way of determining what HTTP path the user is trying to invoke. Generically this could be a .path field (ex: /v1/completions). Some considerations:
.body field.
This makes me consider having the user specify a .type field that would be an enum: completion, embedding, chat.
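For illustration only, here is a minimal sketch of how a .type enum could be resolved to an HTTP path. The names RequestType, PATH_FOR_TYPE and resolve_path are hypothetical, and the mapping to OpenAI-style paths is an assumption; only /v1/completions is mentioned in this thread.

```python
from enum import Enum


class RequestType(str, Enum):
    """Hypothetical enum for the proposed .type field."""
    COMPLETION = "completion"
    EMBEDDING = "embedding"
    CHAT = "chat"


# Assumed mapping onto OpenAI-style paths; not confirmed by this thread.
PATH_FOR_TYPE = {
    RequestType.COMPLETION: "/v1/completions",
    RequestType.EMBEDDING: "/v1/embeddings",
    RequestType.CHAT: "/v1/chat/completions",
}


def resolve_path(message: dict) -> str:
    """Return the HTTP path for a message that carries a .type field.

    Raises ValueError if the type is not one of the enum values.
    """
    return PATH_FOR_TYPE[RequestType(message["type"])]
```

A plain .path field avoids this lookup table entirely, at the cost of letting the user send any path.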
+1 to letting the user decide if they want id as int in the metadata or not.
I would vote for a path field for now. I do think keeping body is important so we are not limiting ourselves to the OpenAI API. We might want or need to support other APIs in the body later.
Next thing to consider is what to do about non-retry-able errors?
I think retry-able errors should be nack'ed (redelivered later). Note: PubSub can be configured with an exponential-backoff redelivery strategy so as not to clog the queue.
Some examples of non-retry-able errors:
My thought for non-retry-able errors: send back an error response on the same responses topic with the error details, and allow the response consumer to filter and take action accordingly.
Other options:
Right now what we do with an end-user is to send back the error as part of the response. In the end it's just a valid response imo that happens to have an error.
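A rough sketch of the split discussed above: retry-able errors are nack'ed, and everything else (including non-retry-able errors) is published as a normal response. The function name handle_result and the set of status codes treated as retry-able are assumptions for illustration; the thread does not define which errors are retry-able.

```python
import json

# Assumption: these status codes are treated as retry-able.
RETRYABLE_STATUS = {429, 502, 503, 504}


def handle_result(status_code: int, body: dict, metadata: dict):
    """Decide what to do with a backend result.

    Returns ("nack", None) when the message should be redelivered,
    or ("ack", payload) where payload is the JSON string to publish
    on the responses topic.
    """
    if status_code in RETRYABLE_STATUS:
        return "nack", None
    # Successes and non-retry-able errors share one response shape,
    # so the consumer can filter on status_code.
    response = {"status_code": status_code, "metadata": metadata, "body": body}
    return "ack", json.dumps(response)
```

With this shape the response consumer only needs to look at status_code to separate errors from successes.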
To keep the MVP simple, my suggestion would be either no retries (send back whatever the response is) or to treat all errors equally: x retries no matter what kind of error it is. It may be a waste to retry a malformed request 3 times, but we can optimize later; not a big deal for the MVP imo.
{"metadata": dict, "response": dict}
PR is currently in a place where there are no retries on failure - however the failures do propagate back on the response topic. We can either implement retries in this PR or follow up with another one.
Note: I have implemented the PR with an un-nested json structure for requests and responses:
# Request
{
  "path": "/v1/completions",
  "metadata": {"key": "val"},
  "body": {"model": "my-model", "prompt": "whats your favorite color?"}
}
# Response (on another topic)
{
  "status_code": 200,
  "metadata": {"key": "val"},
  "body": {"choices": [{"text": "My favorite color is blue"}]}
}
Note the error structure follows the same convention right now: https://github.com/substratusai/lingo/pull/88/files#r1545874400
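As a sketch of how a consumer of this format might validate a request and build the matching response, assuming the un-nested structure shown above: parse_request and build_response are hypothetical helpers, not functions from the PR. Whether a missing metadata field should be rejected is left open; this sketch defaults it to an empty dict.

```python
import json


def parse_request(raw: bytes) -> dict:
    """Parse a request message and check the fields used above."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("request must be a JSON object")
    for field in ("path", "body"):
        if field not in msg:
            raise ValueError(f"missing required field: {field}")
    # Assumption: metadata is optional and defaults to an empty dict.
    msg.setdefault("metadata", {})
    return msg


def build_response(request: dict, status_code: int, body: dict) -> bytes:
    """Build a response message that echoes the request metadata."""
    response = {
        "status_code": status_code,
        "metadata": request["metadata"],
        "body": body,
    }
    return json.dumps(response).encode("utf-8")
```

Because error responses follow the same convention, build_response can be reused for them with a non-200 status_code.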
Ability for Lingo to directly listen and publish to a pub/sub topic
Messages should be using JSON
Request Message format:
Output format:
The output message also contains the metadata of the request. This is needed such that end-users can easily join back the original request with the response they receive in another pubsub topic. Metadata is required because otherwise users risk sending requests but having no way to correlate the request and response.
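To make the correlation concrete, here is a small sketch of joining responses back to requests on a metadata key. The key name request_id and the function join_responses are hypothetical; users are free to put any identifier(s) in metadata.

```python
def join_responses(requests: list, responses: list, key: str = "request_id") -> list:
    """Pair each response with the request that carries the same metadata[key].

    Responses whose metadata has no matching request are skipped.
    """
    by_id = {req["metadata"][key]: req for req in requests}
    pairs = []
    for resp in responses:
        req = by_id.get(resp["metadata"].get(key))
        if req is not None:
            pairs.append((req, resp))
    return pairs
```

A composite identifier would work the same way if the lookup key is built from several metadata fields.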
MVP error handling:
x retries no matter what kind of error it is. It may be a waste to retry a malformed request 3 times, but we can optimize later; not a big deal for the MVP imo.
{"metadata": dict, "response": dict}
Things to consider: