Open mum4k opened 2 years ago
We used to call drainConnections()
.. but checking this call was removed in https://github.com/envoyproxy/nighthawk/commit/15b86be8ff62b6ab6de41511d37d445f240c56d4#diff-d638190c1ff04b6501032412ab8ae3d4322513fd75e43006ac399fc829be8328L112
That was a while ago, and I have no recollection of any reason why that call had to be removed. Therefore, I think that it may have been dropped by accident :-(
Thank you for adding the details @oschaaf. This gives us reassurance that adding it back it should do no harm and should be an easy fix.
Did a quick check into this and this might require an upstream envoy change. https://github.com/envoyproxy/envoy/pull/16544/files obfuscated the connection pool behind a new object called PoolData. https://github.com/envoyproxy/envoy/pull/16865/files is an example change made previously by @mum4k that re-exposed some methods. A similar change is probably required to allow us to drain connections.
The pool termination code is here.
While the code does register an idle callback, that callback only gets called if the pool is draining. Nighthawk doesn't seem to trigger pool drain anywhere. We should call
drainConnections(DrainAndDelete)
while waiting for the pool to drain.The current state can cause Nighthawk to crash, because the shutdown thread owns the client dispatcher and may end up reading packets while shutting down resulting in a call to pure virtual method.
This can be reproduced by running high QPS test using HTTP3/QUIC, inducing a situation where Nighthawk has to deal with some late responses after it started shutting down.