Batch Efficiency Thoughts

ingalls commented 1 year ago

Context

For the next phase of work there are several tasks that we could take on to optimize the new Batch Infrastructure. The following suggestions should be implemented sequentially, testing after each one to confirm that the next step is actually needed to achieve the speed-versus-cost balance we are looking for.

Create a new SageMaker Serverless Inference Endpoint with a concurrency of 10x the desired number of simultaneous batch tasks (~10-20). Then point the Batch Inferencer at the new endpoint and remove the priority checks.
Increase the BatchSize processed by a single Batch Inference call from 1 to 10, and add support for sending multiple inferences concurrently via the Batch Inference processor.
Remove the FIFO queues entirely and switch back to standard SQS queues for even higher concurrency.
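
As a rough illustration of the second suggestion, here is a minimal Python sketch of a processor that receives a batch of up to 10 items and sends the inferences concurrently instead of one at a time. The names `process_batch`, `infer_one`, and `fake_infer_one` are hypothetical and are not taken from the animl-ingest code; `infer_one` stands in for whatever call actually invokes the SageMaker endpoint.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def process_batch(
    image_keys: Sequence[str],
    infer_one: Callable[[str], Awaitable[str]],
    max_in_flight: int = 10,
) -> List[str]:
    """Run one inference per key, with at most `max_in_flight` running at once.

    Results are returned in the same order as `image_keys`.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")

    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(key: str) -> str:
        # The semaphore caps how many requests hit the endpoint simultaneously.
        async with semaphore:
            return await infer_one(key)

    return list(await asyncio.gather(*(guarded(key) for key in image_keys)))


async def fake_infer_one(key: str) -> str:
    # Hypothetical stand-in for the real endpoint call (assumption).
    await asyncio.sleep(0)
    return f"result:{key}"


if __name__ == "__main__":
    keys = [f"image-{i}.jpg" for i in range(10)]
    print(asyncio.run(process_batch(keys, fake_infer_one)))
```

Capping in-flight requests per batch keeps the total load on the endpoint predictable: roughly the per-batch cap multiplied by the number of batches running at the same time.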

cc/ @rbavery @nathanielrindlaub

nathanielrindlaub commented 1 year ago

Sounds good to me @ingalls. I can deploy a Serverless Endpoint to get this started. To clarify, should I set it to a concurrency of [desired concurrency] x [possible number of simultaneous batches], so perhaps 10x10=100 for starters?
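
For reference, the sizing arithmetic in the question above can be sketched in Python as follows. This only builds the production-variant dictionary in the shape that boto3's `create_endpoint_config` accepts for serverless endpoints; it does not call AWS. The model name, memory size, and helper names are assumptions for illustration, and AWS applies quotas to serverless concurrency, so the computed value should be checked against the account's limits.

```python
def serverless_max_concurrency(per_batch_concurrency: int, simultaneous_batches: int) -> int:
    """Endpoint concurrency = concurrent requests per batch x batches running at once."""
    if per_batch_concurrency < 1 or simultaneous_batches < 1:
        raise ValueError("both values must be at least 1")
    return per_batch_concurrency * simultaneous_batches


def serverless_variant(model_name: str, max_concurrency: int, memory_mb: int = 4096) -> dict:
    """Build one ProductionVariants entry with a ServerlessConfig block."""
    return {
        "ModelName": model_name,  # hypothetical model name (assumption)
        "VariantName": "AllTraffic",
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,  # assumed value, not from the thread
            "MaxConcurrency": max_concurrency,
        },
    }


if __name__ == "__main__":
    concurrency = serverless_max_concurrency(10, 10)
    print(serverless_variant("example-model", concurrency))
```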

nathanielrindlaub commented 1 year ago

@ingalls, quick update: I've deployed a Serverless Inference endpoint with a concurrency of 100 and pointed the dev environment to it.

tnc-ca-geo / animl-ingest

Batch Efficiency Thoughts #54