Open knoha opened 1 year ago
🤔 any idea how we can fix this?
This could be an interesting issue. Seems like it is not happening under a constant load (on prod env for example). But when there are some short breaks in the load (like in local env) - I can observe that the TCP connection each time is created from scratch and not taken from a pool.
Would be good if someone else also could spend some time checking this issue. It could be that I don't fully understand how it behaves under the hood.
So there's a timing that's based on a timeout? I would start by writing a repro (unit test or a benchmark) that causes a new new TCP connection to be established, to start in a visible/measurable way.
facing same issue I tried with aiohttp with session persistence but ssl handshake happen every 15-20 sec if connection is idle
What is the bug?
An instance of TCPConnector created here, does not recreate connections, when the initial set of connections are used (utilized). When proto object is closed here (and it will be closed after default 15 keepalive seconds), recreated connection is never placed into the pool, so for each subsequent requests we need to establish a new TCP connection which strike performance really bad (in my case additional ~100-200ms on each request). So even if you have a fast query this TCP thing is just ruining your performance.
How can one reproduce the bug?
What is the expected behavior?
When a new TCP connection is created is has to be placed in the pool and utilized by subsequent calls until expires.
What is your host/environment?
Python 3.11.3 aiohttp==3.8.5 opensearch-py==2.3.0
Do you have any screenshots?
These are just print statements I've added for debuging the flow. Initially proto 1 and proto 2 are same, used from the pool.
Later, after default initial 15 seconds, proto 1 will be always None (empty TCP pool), and each time new connection will be created.
UPD: proto1 (before) and proto2 (after) logs - these are lines before and after getting a connection here.
I know this and the issue on the underlying library, but maybe it should be replaced or some other fix is done?