Open Keyrxng opened 1 month ago
Generally I am against the use of cron for load balancing and scaling reasons, but if this is a centralized system that can handle it for all partners with one batching script then this might be acceptable.
If we want to try and reframe the problem as having the system remember "important" conversations, then we could create batches based on some key events of the task pipeline.
But perhaps cron might be superior in this context.
We could create an entry in the db with a null vector, and then write a Postgres function that fills in the entries whose vector is null with the actual embeddings. This way we can use pg_cron to update embeddings.
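A rough sketch of that backfill step, to make the idea concrete. It uses Python's sqlite3 as a stand-in for Postgres, and the table name comments, its columns, and fake_embed are all hypothetical; in the real setup the same SELECT/UPDATE logic would live in a Postgres function scheduled by pg_cron.

```python
import json
import sqlite3
from typing import Callable, List


def backfill_null_embeddings(
    conn: sqlite3.Connection,
    embed: Callable[[str], List[float]],
) -> int:
    """Fill in every row whose embedding is still NULL; return how many were updated."""
    # Hypothetical schema: comments(id, body, embedding), embedding stored as JSON text.
    rows = conn.execute(
        "SELECT id, body FROM comments WHERE embedding IS NULL"
    ).fetchall()
    for comment_id, body in rows:
        conn.execute(
            "UPDATE comments SET embedding = ? WHERE id = ?",
            (json.dumps(embed(body)), comment_id),
        )
    conn.commit()
    return len(rows)


def fake_embed(text: str) -> List[float]:
    """Placeholder for a real embedding call (assumption, for illustration only)."""
    return [float(len(text)), 0.0]
```

Running it twice in a row is safe: the second run finds no null rows and updates nothing.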
We could use the Batch API and reap the benefits it provides, since right now we do not need the real-time embeddings that we currently have, which are awesome but more expensive than they need to be.
So we'd simply store each comment as it comes in, as happens now, then every n hours send all comments in the database that do not have an embedding.
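For illustration, here is roughly what the batch input could look like. The Batch API takes a JSONL file with one request per line (custom_id, method, url, body); the function name, the comment-<id> custom_id scheme, and the choice of text-embedding-3-small are assumptions for this sketch, not what the plugin uses today.

```python
import json
from typing import Iterable, Tuple


def build_embedding_batch_jsonl(
    comments: Iterable[Tuple[int, str]],
    model: str = "text-embedding-3-small",  # assumed model, swap for the real one
) -> str:
    """Turn (comment_id, body) pairs into Batch API input, one JSON request per line."""
    lines = []
    for comment_id, body in comments:
        request = {
            # custom_id lets us map each result back to its database row later.
            "custom_id": f"comment-{comment_id}",
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": body},
        }
        lines.append(json.dumps(request))
    return "\n".join(lines)
```

The resulting file would then be uploaded and a batch created against the /v1/embeddings endpoint; results come back asynchronously, so the custom_id is what ties each embedding back to its comment.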
We could have a cron job perform this function for us, but it would need to be able to handle the async nature of requests to this endpoint. Alternatively, build a time check into the worker itself using the updated_at of the most recent embedding-filled entry.
https://platform.openai.com/docs/guides/batch/overview
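The worker-side time check could be as small as this. It is only a sketch: should_send_batch and its parameters are hypothetical names, and it assumes the worker has already queried the updated_at of the most recent row that has an embedding (None if there is none yet).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_send_batch(
    latest_embedded_at: Optional[datetime],
    now: datetime,
    every_hours: float,
) -> bool:
    """Return True when at least `every_hours` have passed since the last filled embedding."""
    if latest_embedded_at is None:
        # Nothing has been embedded yet, so the first batch is due.
        return True
    return now - latest_embedded_at >= timedelta(hours=every_hours)


# Example call with hypothetical timestamps: last embedding 7 hours ago, n = 6.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
due = should_send_batch(now - timedelta(hours=7), now, every_hours=6)
```

This keeps scheduling inside the worker, so no separate cron job is needed, at the cost of the check only running when the worker is invoked.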