A problem with the current solution/implementation is that the concurrency is static. You set a number, and that many threads will be created for each stage of each endpoint (it can be configured individually per stage, though), each sitting with its own JMS Consumer. This number would typically be the default from the MatsFactory, hence you get '2 x CPUs' x 2 (due to the dedicated interactive threads) on every single stage. The problem is that you end up with potentially very many threads, even on endpoints that get one message per hour, or on endpoints that are very fast, where both situations could have been handled by a single thread.
If we could monitor a number that indicates load, backlog or similar, we could scale up when there was need, and scale back down when the queues were empty.
Deduced locally:
Duty cycle: How long the thread spends in lambda processing vs. how long it waits for a new message. If the duty cycle goes above 50%, scale up until it drops below 50% again. Maybe keep both a slow and a fast exponential decay average of the duty cycle: if the fast one is going up, increase; if the slow one is going down, decrease, unless the fast one says up. Maybe: if the fast duty cycle is 50-75%, increase by 1 thread; 75-90%, increase by 50%; above 90%, double - up to max concurrency. Decrease by 1 thread as long as the slow duty cycle is below 50%, unless the fast one says up.
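A minimal sketch of such a dual fast/slow exponential-decay duty-cycle heuristic could look something like the following. All class and method names (DutyCycleScaler, recordCycle, suggestConcurrency) are hypothetical and not part of the Mats API; the smoothing weights and thresholds are just the numbers guessed at above.

```java
/**
 * Sketch of a duty-cycle based concurrency heuristic, using a fast and a slow
 * exponentially weighted moving average (EWMA) of the per-cycle duty.
 * Hypothetical names - not part of the Mats API.
 */
public class DutyCycleScaler {
    private final double fastAlpha;   // weight for the fast EWMA, e.g. 0.5
    private final double slowAlpha;   // weight for the slow EWMA, e.g. 0.05
    private final int maxConcurrency; // never suggest more threads than this

    private double fastDuty;          // fast EWMA of duty cycle, 0.0 - 1.0
    private double slowDuty;          // slow EWMA of duty cycle, 0.0 - 1.0

    public DutyCycleScaler(double fastAlpha, double slowAlpha, int maxConcurrency) {
        this.fastAlpha = fastAlpha;
        this.slowAlpha = slowAlpha;
        this.maxConcurrency = maxConcurrency;
    }

    /** Record one receive/process cycle: time waiting for a message, and time in the lambda. */
    public synchronized void recordCycle(long waitNanos, long processNanos) {
        long total = waitNanos + processNanos;
        if (total == 0) {
            return;
        }
        double duty = (double) processNanos / total;
        fastDuty = fastAlpha * duty + (1 - fastAlpha) * fastDuty;
        slowDuty = slowAlpha * duty + (1 - slowAlpha) * slowDuty;
    }

    /** Suggest a new concurrency based on the current one and the two averages. */
    public synchronized int suggestConcurrency(int current) {
        // Fast average says "up": scale up, more aggressively the higher the duty cycle.
        if (fastDuty > 0.90) {
            return Math.min(current * 2, maxConcurrency);                         // >90%: double
        }
        if (fastDuty > 0.75) {
            return Math.min(current + Math.max(1, current / 2), maxConcurrency);  // 75-90%: +50%
        }
        if (fastDuty > 0.50) {
            return Math.min(current + 1, maxConcurrency);                         // 50-75%: +1
        }
        // Fast average does not say "up": decrease by 1 as long as the slow average is below 50%.
        if (slowDuty < 0.50 && current > 1) {
            return current - 1;
        }
        return current;
    }
}
```

The idea would be that each stage's consumer loop calls recordCycle(..) after every receive/process cycle, and some periodic housekeeping task calls suggestConcurrency(..) and adjusts the number of JMS Consumers for that stage (between 1 and max concurrency) accordingly.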
Needs external deduction, AFAIK: