schubergphilis / towerlib

A python library to interface with ansible tower's (awx) api.
MIT License
43 stars 39 forks source link

Connection pool is full #94

Closed jitterjuice closed 2 years ago

jitterjuice commented 2 years ago

Hello,

After running this through a few internal pipelines for verification, I've noticed on longer-running jobs I begin to see the following in the job monitor/output:

Connection pool is full, discarding connection: my.ansible.url.here

I understand this is part of the urllib3/requests Session management. I was wondering if you have faced this before?

I suspect it might be related to max_workers, in the _get_paginated_response method (https://github.com/schubergphilis/towerlib/blob/e4ae19f7b88d7049e1f23f279040461fa464c6e9/towerlib/towerlib.py#L293).

Are there any recommended approaches to optimize the connection pool being used?

phospi commented 2 years ago

Hi,

I can confirm this behaviour. We receive a lot of this messages as well. But up to now we never took appropriate time to analyze this.

jitterjuice commented 2 years ago

@phospi , has this caused your AWX/Ansible instance to misbehave? or is it more output from where its called? My main concern here is TCP connection pressure being put on AWX if there's many concurrent calls using towerlib.

phospi commented 2 years ago

No, definitely no misbehaviour on awx side. anyway we work containerizered, so the pressure is only at webservice container. And we never detected a bootleneck there.

jitterjuice commented 2 years ago

We work containerized as well, happy to hear no bottlenecks. Wondering if the connection pool optimization is needed, or if this warning is a false alarm that could in theory be hidden.. need to assess the impact of doing so.

costastf commented 2 years ago

I would guess that is the latter, although if you can assess, it would be amazing. I don't think there is much room for improvement there, this is pretty low level feature from urllib, but looking forward to more insights from hard usage from you guys.

phospi commented 2 years ago

As additional info: This a pure urllib issue and how to cope with that is not really the subject of towerlib. There are a lot of threads about this on stack overflow: https://stackoverflow.com/questions/53765366/urllib3-connectionpool-connection-pool-is-full-discarding-connection

By default, if a new request is made and there is no free connection in the pool then a new connection will be created. However, this connection will not be saved if more than maxsize connections exist. This means that maxsize does not determine the maximum number of connections that can be open to a particular host, just the maximum number of connections to keep in the pool.

costastf commented 2 years ago

I think it's ok to close this @jitterjuice ?

jitterjuice commented 2 years ago

Agree'd with @phospi , it is definitely a pure urllib issue. I think this issue is okay to close @costastf , but worth noting that if the message does appear for longer running jobs, I've found it has helped to tinker with the HTTP Adaptors pool_connections and pool_maxsize, such that pool_maxsize can handle the max_workers used in the _get_paginated_response method.

As a result, I've reduced/eliminated this issue appearing in job output. For anyone interested, the commit that implemented this can be found here: Connection Pool Configuration.

costastf commented 2 years ago

I think this a great change to incorporate, do you mind submitting this as a PR and add yourself to the contributors @jitterjuice ?

jitterjuice commented 2 years ago

Hi @costastf , I have created a PR as requested: https://github.com/schubergphilis/towerlib/pull/97