Closed ingalls closed 1 year ago
Sounds good to me @ingalls. I can deploy a Serverless Endpoint to get this started. To clarify, should I set it to a concurrency of [desired concurrency] x [possible number of simultaneous batches], so perhaps 10x10=100 for starters?
@ingalls, quick update: I've deployed a Serverless Inference endpoint with a concurrency of 100 and pointed the dev environment to it.
Context
For the next phase of work there are several tasks that we could take on to optimize the new Batch Infrastructure. The following suggestions should be implemented sequentially, testing after each one to confirm that the next step is actually needed to achieve the speed vs. cost balance we are looking for.
cc/ @rbavery @nathanielrindlaub