Use the batch API - Githubissues

Keyrxng commented 1 month ago

We could use the batch API and reap the benefits it provides, since right now we do not need real-time embeddings. What we currently have is awesome but more expensive than it needs to be.

So we'd simply store each comment as it comes in, as happens now, then every n hours send all comments in the database that do not yet have an embedding.

We could have a cron job do this for us, but it would need to handle the async nature of requests to this endpoint. Alternatively, we could build a time check into the worker itself using the updated_at of the most recent entry that has its embedding filled in.
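A minimal sketch of the worker-side time check, assuming we can query the `updated_at` of the most recently embedded row. The function name `isBatchDue` and its parameters are hypothetical, not existing code in the plugin.

```typescript
// Decide whether enough time has passed since the last embedding was written.
// lastEmbeddedAt: updated_at of the newest row that has an embedding
//                 (null if no row has been embedded yet).
// intervalHours:  the "every n hours" interval from the proposal.
function isBatchDue(
  lastEmbeddedAt: Date | null,
  intervalHours: number,
  now: Date = new Date()
): boolean {
  // Nothing has been embedded yet, so the first batch is always due.
  if (lastEmbeddedAt === null) return true;

  const elapsedMs = now.getTime() - lastEmbeddedAt.getTime();
  return elapsedMs >= intervalHours * 60 * 60 * 1000;
}
```

The worker would call this on each incoming event and only submit a batch when it returns `true`. One caveat: a batch that is still in flight has not updated any row yet, so a real version would probably also need to track the pending batch to avoid submitting it twice.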

50% cheaper
separate rate limit
up to 24hr turnaround (which makes it very clean if we do it every 24hrs)

https://platform.openai.com/docs/guides/batch/overview
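For reference, a rough sketch of how the pending comments could be turned into a Batch API input file. The JSONL line shape (`custom_id`, `method`, `url`, `body`) follows the guide linked above. The `PendingComment` type, the function name, and the default model are assumptions for illustration only.

```typescript
// Hypothetical shape of a stored comment that has no embedding yet.
interface PendingComment {
  id: string;
  plaintext: string;
}

// One request line of a Batch API input file targeting the embeddings endpoint.
interface BatchRequestLine {
  custom_id: string;
  method: "POST";
  url: "/v1/embeddings";
  body: { model: string; input: string };
}

// Build the JSONL payload: one JSON object per line, one line per comment.
// The model name is an assumption; use whichever model the plugin already uses.
function toBatchJsonl(
  comments: PendingComment[],
  model: string = "text-embedding-3-small"
): string {
  return comments
    .map(
      (comment): BatchRequestLine => ({
        // custom_id lets us match each result back to its database row.
        custom_id: comment.id,
        method: "POST",
        url: "/v1/embeddings",
        body: { model, input: comment.plaintext },
      })
    )
    .map((line) => JSON.stringify(line))
    .join("\n");
}
```

The resulting string would be uploaded as a file and referenced when creating the batch. Using the comment's own id as `custom_id` keeps the later write-back step simple.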

0x4007 commented 1 month ago

Generally I am against the use of cron for load balancing and scaling reasons, but if this is a centralized system that can handle it for all partners with one batching script then this might be acceptable.

If we want to try and reframe the problem as having the system remember "important" conversations, then we could create batches based on some key events of the task pipeline.

~~issue filed~~
Price label set
issue closed as complete
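The event-based idea could look roughly like this. It is only a sketch: the `IssueEvent` type is a trimmed-down, hypothetical slice of a GitHub issues webhook payload, and the `Price:` label pattern is an assumption about how price labels are named.

```typescript
// Hypothetical, minimal slice of a GitHub "issues" webhook payload.
interface IssueEvent {
  action: string;
  label?: { name: string };
  issue: { state_reason?: string | null };
}

// Return true when the event is one of the key pipeline events listed above:
// a price label being set, or the issue being closed as completed.
function isBatchTrigger(
  event: IssueEvent,
  pricePattern: RegExp = /^Price:/ // assumed label naming convention
): boolean {
  if (event.action === "labeled") {
    return event.label !== undefined && pricePattern.test(event.label.name);
  }
  if (event.action === "closed") {
    // Closing as "not planned" does not count as completed.
    return event.issue.state_reason === "completed";
  }
  return false;
}
```

With this approach a batch is only created for conversations that reach one of these milestones, instead of on a fixed schedule.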

But perhaps cron might be superior in this context.

sshivaditya2019 commented 1 month ago

We could create the entry in the db with a null vector, and then write a Postgres function that updates the entries that have a null vector with the actual embeddings. This way we can use pg_cron to update the embeddings.

Supabase pg_cron
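Whichever scheduler is used, the batch output has to be mapped back onto the rows with a null vector. Below is a worker-side sketch of that step in TypeScript, not the Postgres function proposed above. The output-line shape is my reading of the Batch API guide and should be checked against it, and the function name is hypothetical.

```typescript
// Assumed shape of one line in the Batch API output file for /v1/embeddings.
interface BatchOutputLine {
  custom_id: string;
  response: {
    status_code: number;
    body: { data: { embedding: number[] }[] };
  } | null;
  error: unknown;
}

// Parse the output JSONL into { custom_id: embedding }, skipping failed requests
// so their rows keep a null vector and get picked up by the next batch.
function parseEmbeddingResults(jsonl: string): Record<string, number[]> {
  const results: Record<string, number[]> = {};
  for (const raw of jsonl.split("\n")) {
    const trimmed = raw.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline

    const line = JSON.parse(trimmed) as BatchOutputLine;
    const embedding = line.response?.body?.data?.[0]?.embedding;
    if (line.response?.status_code === 200 && embedding) {
      results[line.custom_id] = embedding;
    }
  }
  return results;
}
```

Each key in the returned object is the `custom_id` sent with the request, so if that was the row id, the update is a straightforward write of the embedding to that row.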

ubiquity-os-marketplace / text-vector-embeddings

Use the batch API #5