Closed: dfish3r closed this pull request 2 years ago
Hi @dfish3r,
Thanks for your input in the discussion at https://shibboleth.atlassian.net/browse/IDP-1898
I'm trying to assess whether this PR would actually fix the issue I'm facing (described in the above link).
What I see as the desired behaviour is retrying until a connection is established to the other server.
With this implementation (retrying up to getMaxPoolSize() times), in the extreme case where the pool is at its maximum size with all connections established to the first server, the retries would stop one step short of reaching the other server.
When the pool is not at its maximum size, after using all connections in the pool there would be an attempt to establish a connection to the other server.
Would you consider changing the number of retries to getMaxPoolSize() + 1, to also accommodate the situation where the pool is at its maximum size?
Cheers, Vlad
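For illustration, here is a minimal sketch of the checkout-with-retry behaviour being discussed. The pool interface, method names, and timing here are placeholders, not ldaptive's actual API.

```java
// Illustrative sketch only: ConnectionPool and PooledConnection are placeholders,
// not ldaptive's actual API.
public final class CheckoutRetrySketch {

  interface PooledConnection {
    boolean validate();
  }

  interface ConnectionPool {
    int getMaxPoolSize();
    PooledConnection checkOut();        // returns an existing connection, or opens a new one
    void remove(PooledConnection conn); // evict a connection that failed validation
  }

  /**
   * With only getMaxPoolSize() attempts, a full pool of stale connections to the
   * downed server can be exhausted without ever opening a fresh connection; the
   * extra attempt guarantees one checkout after the stale connections are evicted.
   */
  static PooledConnection checkOutWithRetry(ConnectionPool pool) {
    int attempts = pool.getMaxPoolSize() + 1;
    for (int i = 0; i < attempts; i++) {
      PooledConnection conn = pool.checkOut();
      if (conn.validate()) {
        return conn;
      }
      pool.remove(conn); // the stale connection leaves the pool before the next attempt
    }
    throw new IllegalStateException("no valid connection after " + attempts + " attempts");
  }
}
```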
One of the features in v2 is auto-reconnect. So in the nominal case, when your active server goes down, connections should attempt to reconnect to the passive server in the background. But let's assume some pesky load balancer that doesn't send TCP resets. In that case you would be dependent on the pool validation to reconnect.
I think you make a good argument for maxSize + 1, and an early version of this code worked the same way. So I'll make that change. Note that the code still respects the pool block wait time, so for large pools you may not have enough time to cycle all the connections out of the pool.
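To put rough numbers on that caveat (the figures below are made up; the real per-attempt cost depends on the validator and connect timeouts):

```java
import java.time.Duration;

// Made-up numbers, only to show how the block wait time can run out before a
// large pool has been fully cycled.
public final class BlockWaitBudget {
  public static void main(String[] args) {
    Duration blockWaitTime = Duration.ofSeconds(5);
    Duration perFailedAttempt = Duration.ofMillis(500); // validate + evict one stale connection
    int maxPoolSize = 20;

    Duration worstCase = perFailedAttempt.multipliedBy(maxPoolSize + 1); // 10.5 s
    boolean fits = worstCase.compareTo(blockWaitTime) <= 0;              // false here
    System.out.println("worst case " + worstCase + ", fits in block wait time: " + fits);
  }
}
```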
Thanks! Yes, I'm aware the timeout might still not cut it, but this would be a step in the right direction. Thanks for that!
Thanks!
The pooling implementation would benefit from handling validation exceptions on checkout. Provide an implementation that retries getting a connection. Bring back ValidationException and ActivationException for general API usage.
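A rough sketch of what that could look like on checkout. The exception and pool types below are local placeholders standing in for the real ldaptive classes, and the retry count follows the discussion above; it is not the code from this PR.

```java
// Placeholder types only: the real ValidationException and ActivationException are
// ldaptive classes, and the pool interface here is an assumption for illustration.
public final class ValidatingCheckout {

  static class PoolException extends Exception {}
  static class ValidationException extends PoolException {}
  static class ActivationException extends PoolException {}

  interface Connection {}

  interface ConnectionPool {
    int getMaxPoolSize();
    /** Throws when the checked-out connection fails activation or validation. */
    Connection getConnection() throws PoolException;
  }

  /** Retries checkout on validation/activation failures, up to maxPoolSize + 1 attempts. */
  static Connection getConnectionWithRetry(ConnectionPool pool) throws PoolException {
    PoolException last = null;
    for (int i = 0; i <= pool.getMaxPoolSize(); i++) {
      try {
        return pool.getConnection();
      } catch (ValidationException | ActivationException e) {
        last = e; // the failed connection is evicted by the pool; try again
      }
    }
    throw last != null ? last : new PoolException();
  }
}
```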