Open matsluni opened 4 years ago
Any idea how to deal with multiple regions? I wonder, since the host differs depending on the region, so it should handle multiple connection pools depending on the AWS client usage.
Hi @gabfssilva, thanks for giving this a thought. Yes, we would need multiple pools, one for every AWS service endpoint (and possibly multiple regions per service).
A first naive idea that comes to mind is getting the service URL from `httpRequest.uri` and building a kind of map/cache with ServiceUrl -> ConnectionPool. But I don't know how feasible this is. This would be in the hot code path for every request.
That's what I thought too. A synchronized map should be enough. Well, I'll think of something.
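A minimal sketch of the map/cache idea, using only the standard library. `Pool` is a hypothetical stand-in for a real host connection pool, and the default port of 443 is an assumption. The lookup relies on `ConcurrentHashMap.computeIfAbsent`, which creates the entry at most once per key, so no explicit `synchronized` block is needed around the lookup.

```scala
import java.net.URI
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

object PoolCacheSketch {
  // Hypothetical stand-in for a real per-host connection pool.
  final case class Pool(host: String, port: Int)

  // Counts how many pools were actually created (for illustration only).
  val created = new AtomicInteger(0)

  // Keyed by (host, port), since each region/service resolves to its own host.
  private val pools = new ConcurrentHashMap[(String, Int), Pool]()

  // Returns the pool for the request's endpoint, creating it on first use.
  def poolFor(uri: URI): Pool = {
    // Assumption: HTTPS on 443 when the URI carries no explicit port.
    val port = if (uri.getPort == -1) 443 else uri.getPort
    pools.computeIfAbsent(
      (uri.getHost, port),
      (key: (String, Int)) => {
        created.incrementAndGet()
        Pool(key._1, key._2)
      }
    )
  }
}
```

Two requests to the same host share one pool, while a request to another region gets its own. The cost on the hot path is one concurrent-map read per request once the pool exists.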
I had another idea for how a design for this could look.
What if we extend the builder of the Akka async client with something like `withCachedPoolSettings` (maybe there is a more suitable name), where we let the user provide the endpoints and regions used in user code? Out of this, we construct the map of cachedConnectionPools, and for the request it's just a simple lookup, without any thread synchronization needed.
We can also decide whether we want to fail (with an exception) if an endpoint is not in the map, or fall back to the sharedPool.
This approach makes it configurable for the user and avoids the potential synchronization performance penalty.
WDYT?
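A rough sketch of how the preconfigured variant could look, assuming the user hands over the set of hosts up front. All names here (`PoolRegistry`, `MissingEndpointPolicy`, `Pool`) are hypothetical and not part of any existing builder API. The map is built once and never mutated, so lookups need no synchronization.

```scala
object PreconfiguredPoolsSketch {
  // Hypothetical stand-in for a real per-host connection pool.
  final case class Pool(host: String)

  // What to do when a request targets a host that was not declared.
  sealed trait MissingEndpointPolicy
  case object Fail extends MissingEndpointPolicy
  case object UseSharedPool extends MissingEndpointPolicy

  final class PoolRegistry(hosts: Set[String], policy: MissingEndpointPolicy) {
    private val shared = Pool("shared")

    // Built once at construction and immutable afterwards.
    private val pools: Map[String, Pool] =
      hosts.map(h => h -> Pool(h)).toMap

    // Plain lookup; the default branch is only evaluated on a miss.
    def poolFor(host: String): Pool =
      pools.getOrElse(
        host,
        policy match {
          case UseSharedPool => shared
          case Fail =>
            throw new IllegalArgumentException(
              s"No connection pool configured for host: $host")
        }
      )
  }
}
```

With `UseSharedPool`, an undeclared host quietly uses the shared pool; with `Fail`, the misconfiguration surfaces on the first request to that host.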
I think it can be done. The only problem here is that users would need to know which domains they need to set up. Each AWS service has a different domain, and using a "fake AWS" also implies using different endpoints. I fear it would become complex.
Instead of using a synchronized map, we could use an actor to handle the pools:
    // if the pool does not exist, it's created here
    val pool = (pools ? Gimme(domain)).mapTo[Pool]
    for {
      p <- pool
      r <- p.offer(request, promise)
      // check `r` to see if the request was queued
    } yield promise.future
I ran a POC over here and it worked quite well, but it's hard to measure any performance penalty compared to the `singleRequest` approach. The only issue here is that the first request will always be much slower than the following ones, but I'm not sure whether that already happens with `singleRequest`.
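To illustrate the single-owner idea without pulling in Akka, here is a sketch in which a single-threaded executor stands in for the actor's mailbox. This is a substitute technique, not the POC mentioned above, and `Pool` and `gimme` are hypothetical names. Because only the owner thread touches the map, a plain mutable map suffices, and callers receive a `Future[Pool]` much like the result of an ask.

```scala
import java.util.concurrent.Executors
import scala.collection.mutable
import scala.concurrent.{ExecutionContext, Future}

object PoolOwnerSketch {
  // Hypothetical stand-in for a real per-host connection pool.
  final case class Pool(host: String)

  // One daemon thread plays the role of the actor: requests are handled
  // one at a time, in submission order.
  private val owner: ExecutionContext = ExecutionContext.fromExecutor(
    Executors.newSingleThreadExecutor((r: Runnable) => {
      val t = new Thread(r, "pool-owner")
      t.setDaemon(true)
      t
    })
  )

  // Only ever accessed from the owner thread, so no locking is required.
  private val pools = mutable.Map.empty[String, Pool]

  // If the pool does not exist yet, it is created here.
  def gimme(host: String): Future[Pool] =
    Future(pools.getOrElseUpdate(host, Pool(host)))(owner)
}
```

Every lookup pays for a hop to the owner thread, which is the kind of overhead that is hard to quantify against a direct call; the first request per host additionally pays for pool creation.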
With the current implementation, a shared connection pool per `ActorSystem` is used for all requests. It's probably better to have a dedicated connection pool per host. See akka/alpakka#1958 and akka/alpakka#1983 for a similar issue and PR.