nathanielrindlaub closed 1 year ago
@nathanielrindlaub Multiple batches share resources until they hit the SageMaker concurrency limit, at which point they compete on a first-come, first-served basis. Our current FIFO batch size is 1, so if (number of concurrent batches) * BatchSize > SageMaker concurrency, inference will start to see throttling errors. There is currently no shared request pool or prioritization of any kind between batches.
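To make that condition concrete, here is a minimal sketch of the check. The function and parameter names are hypothetical, chosen for illustration rather than taken from our codebase, and it assumes each batch keeps at most BatchSize requests in flight at once.

```python
def will_throttle(concurrent_batches: int, batch_size: int,
                  sagemaker_concurrency: int) -> bool:
    """Return True when the combined in-flight requests of all batches
    would exceed the endpoint's concurrency limit.

    Assumption: every batch holds at most `batch_size` requests in
    flight, so the worst-case total is concurrent_batches * batch_size.
    """
    return concurrent_batches * batch_size > sagemaker_concurrency
```

For example, with a batch size of 1 and an illustrative concurrency limit of 10, `will_throttle(10, 1, 10)` is false while `will_throttle(11, 1, 10)` is true.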
We talked a little yesterday about ideally keeping our SageMaker concurrency at 10x the BatchSize, so that we can run ~10 batches simultaneously without hitting throttling limits, assuming we move to a dedicated SageMaker endpoint for batch runs.
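The 10x sizing rule can be sketched as simple arithmetic. The helper names and the `headroom` parameter below are hypothetical, and the sketch assumes a dedicated endpoint whose concurrency is consumed only by batch runs.

```python
def target_concurrency(batch_size: int, headroom: int = 10) -> int:
    """Concurrency to provision so that `headroom` batches fit at once."""
    if batch_size <= 0 or headroom <= 0:
        raise ValueError("batch_size and headroom must be positive")
    return headroom * batch_size


def max_unthrottled_batches(sagemaker_concurrency: int, batch_size: int) -> int:
    """Largest number of simultaneous batches that stays within the limit."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return sagemaker_concurrency // batch_size
```

With the current batch size of 1, `target_concurrency(1)` gives 10, and `max_unthrottled_batches(10, 1)` confirms that 10 batches fit before throttling would start.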
Ok gotcha. Thank you for explaining that!
When I try triggering multiple batch uploads at once, it seems like they're getting processed asynchronously. I don't have a super clear mental model of what's happening on the worker side in that kind of a situation (I guess just pulling off messages from whatever queues are available to draw from that have messages at random), but I wanted to double-check whether images get prioritized at all between multiple simultaneous bulk uploads.
How does the logic - if any - work for that? I don't have an opinion on what's desirable; I'm just hoping to understand the behavior.