Foreword, Banana direction
More details are yet to be shared, but Banana is reworking its inference API within the next month and moving away from the async worker queue pattern (start / check with taskID). We believe that design has added unnecessary latency, reduced traceability for both us and our users, and been a black-box system that users have had to fight with.
The Banana API will soon be "here's a URL to call your deployment at directly". You'll be able to handle multiple URL routes in your server. We'll still load-balance calls across replicas and autoscale from 0->n, but your call will not be transformed between your POST request and its arrival at your Potassium server.
We're moving any call-management logic into open-source, via Potassium.
What is this?
We add a local key-value Store object, for use between calls. All objects have TTLs, to prevent exploding memory/storage.
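To make the TTL idea concrete, here is a minimal sketch of a key-value store whose entries expire. It is an illustration only: the class name `TTLStore`, its `set`/`get` signatures, and the injectable `clock` are assumptions made for this example, not Potassium's actual Store API.

```python
import pickle
import time


class TTLStore:
    """Hypothetical in-memory key-value store; not Potassium's actual Store."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expiry timestamp, pickled value)

    def set(self, key, value, ttl):
        # Pickling up front means unpicklable values fail here, at write time.
        self._items[key] = (self._clock() + ttl, pickle.dumps(value))

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, blob = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._items[key]
            return default
        return pickle.loads(blob)
```

Note that lazy eviction alone only frees keys that are read again; a real implementation would likely also need a periodic sweep to keep memory bounded.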
We also rename async_handler() to background(), which we feel better communicates that it's a background task rather than an awaitable async function that returns results once awaited.
You can see a demo of it here.
Why?
We hope this is our first step toward making building on Banana feel like composing a proper backend rather than just a handler.
Users must be able to continue using start/check async patterns as they do in Banana's current API. Background tasks, coupled with a local Store, allow users to build the start/check pattern themselves (we'll provide examples, of course).
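As a rough illustration of how a background task plus a shared store gives you start/check, here is a self-contained sketch using only the standard library. The names `start`, `check`, and `_tasks` are hypothetical, a plain dict stands in for the Store, and a thread stands in for a Potassium background task; none of this is the Potassium API.

```python
import threading
import uuid

# Stand-in for a shared store: task_id -> {"status": ..., "result": ...}.
_tasks = {}
_lock = threading.Lock()


def _run(task_id, fn, payload):
    # Runs in the background thread and records the outcome under the task ID.
    try:
        outcome = {"status": "done", "result": fn(payload)}
    except Exception as exc:
        outcome = {"status": "failed", "error": str(exc)}
    with _lock:
        _tasks[task_id] = outcome


def start(fn, payload):
    """Kick off fn(payload) in the background and return a task ID immediately."""
    task_id = uuid.uuid4().hex
    with _lock:
        _tasks[task_id] = {"status": "running"}
    threading.Thread(target=_run, args=(task_id, fn, payload), daemon=True).start()
    return task_id


def check(task_id):
    """Return a copy of the current state for task_id, or 'unknown' if absent."""
    with _lock:
        return dict(_tasks.get(task_id, {"status": "unknown"}))
```

A client would call `start` once, keep the returned ID, and poll `check` until the status is no longer "running". In a real server the entries would also carry a TTL so finished tasks do not accumulate.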
How did you test to ensure no regressions?
Tested in the demo video linked above; feel free to watch. This feature only becomes relevant to users in a month, once the Banana V2 API is live.
If this is a new feature what is one way you can make this break?
There's currently no limit to TTL; we'll certainly be adding one before merging here. A user abusing this would have a memory leak in their server and, if on the shared cluster, would cause issues for any neighbors running on the same machine.
There's currently no limit to store size; we may add one. A user abusing this can crash Banana's disks.
Users trying to put GPU-bound objects into the store will get an error at runtime as the store depends on pickle. As most users don't test before pushing to Banana (🙄), and if they do it's often on CPU machines, this error won't appear until models are deployed. (Test your code plz I'm begging you)
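One way to surface this class of error earlier is to check picklability before writing to the store. The sketch below is an assumption-laden illustration: `picklable_or_raise` is a hypothetical helper, and a `threading.Lock` is used as a CPU-only stand-in for an object that pickle rejects, since whether a particular GPU-bound object fails depends on the library that created it.

```python
import pickle
import threading


def picklable_or_raise(value):
    """Fail fast with a readable message if value cannot be pickled."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, TypeError, AttributeError) as exc:
        raise ValueError(f"cannot store {type(value).__name__}: {exc}") from exc
    return value


# Stand-in for a device-bound handle; pickle refuses to serialize locks.
UNPICKLABLE_EXAMPLE = threading.Lock()
```

Calling `picklable_or_raise(UNPICKLABLE_EXAMPLE)` in a local test raises `ValueError` on a CPU machine, which is the kind of check that can run before anything is deployed.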