Closed: DmitryKakurin closed this issue 4 months ago
Do the requests return to the pool? What is the Response thing that was returned? Does it contain the body?
@benoitc the code repro above is literally runnable (example.com and .net are real existing domains). Both requests return 200.
Given `free_count: 2`, I assume yes, connections are returned to the pool.
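For context, a minimal sketch of the repro being discussed (hedged: the original issue used HTTPoison, whose exact setup isn't shown here; this calls hackney directly, and the pool name `:repro` is made up for illustration):

```elixir
# Pool limited to 1 connection, two different hosts hit.
:ok = :hackney_pool.start_pool(:repro, timeout: 15_000, max_connections: 1)

for url <- ["http://example.com", "http://example.net"] do
  {:ok, 200, _headers, ref} = :hackney.get(url, [], "", pool: :repro)
  {:ok, _body} = :hackney.body(ref)  # read the body so the socket is released
end

IO.inspect(:hackney_pool.get_stats(:repro))
# Reported bug: free_count comes back as 2 even though max_connections is 1.
```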
I'm running into this problem as well - any updates on where things are on this?
There will be a release this weekend including a fix for it.
I was bit pretty severely by this bug a couple days ago. Had to disable pooling entirely. Is there a release coming out that contains the fix?
What happened? How did you reproduce the issue?
There is a release that will land today that changes the way connections are handled, but the problem above shouldn't trigger anything severe if connections are released correctly to the pool.
I have code in my system using hackney that calls 3rd-party services to get data. I saw the `:checkout_timeout` error explode, added some metrics to monitor the `in_use` value returned from `:hackney_pool.get_stats(:default)`, and redeployed.
The following screenshot is a new k8s deploy of 10 pods, each with 1 `default` hackney pool running:
The lines going steadily up are each pod reporting the `in_use` connections, and all are rising to the 200 max connections I configured. Shortly after this, the `checkout_timeout` errors started showing up.
Does your code always ensure the body is read? That said, I am working on replacing the pool. The code will be ready for consumption tomorrow. Took more time than expected...
Is reading the body required? I don't see that mentioned in any of the documentation. 🤔
Yes, to release the socket you need to either read the body (which is always done when using the `with_body` option in the request), or skip it using the `skip_body` function or similar. Closing the request also works. If a response is not read completely and the process is still active, hackney currently considers it an active session.
That should indeed be mentioned in the docs...
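The three ways to release a pooled socket described above can be sketched like this (hedged: a minimal illustration against hackney's client API, not code from this thread):

```elixir
url = "http://example.com"

# 1. Read the body explicitly:
{:ok, 200, _headers, ref} = :hackney.get(url)
{:ok, _body} = :hackney.body(ref)  # socket goes back to the pool

# 2. Let hackney read it for you with the with_body option:
{:ok, 200, _headers, _body} = :hackney.get(url, [], "", with_body: true)

# 3. Skip the body when only the status/headers matter:
{:ok, 200, _headers, ref2} = :hackney.get(url)
:ok = :hackney.skip_body(ref2)

# Closing the request also works: :hackney.close(ref)
```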
Oh, ok. We may have some locations where we are not reading the body. Would you be able to point me to the right location in the documentation that says that?
That's not written explicitly in the docs, I think. A connection is removed from the pool automatically if the process is closed before releasing it, or it is released once the whole body has been read; otherwise (logically, I would say) there is no way to know whether the connection should be released.
https://github.com/benoitc/hackney/blob/master/NEWS.md
1.17.0
fix memory leak in connection pool
Is this fixed now?
We downgraded hackney to 1.15.2 and are wondering if it's safe to upgrade again: https://git.pleroma.social/pleroma/pleroma/-/issues/2101
Watch for the release this weekend. Hackney will be bumped to 2.0. This will be announced next week also :)
Hello @benoitc! Any update on this issue?
I'm facing it with httpoison 1.8.0 and hackney 1.18.0. Hackney still doesn't respect `max_connections`, as in @DmitryKakurin's example.
@kuznetskikh does httpoison read the body? The body needs to be read or skipped to release the connection.
Hello @benoitc. I'm sorry for the late response; it's working fine for me now.
I had just configured the hackney pool incorrectly on application start. Now everything is fine, and `max_connections` is taken into account using:
:hackney_pool.child_spec(:main_api_pool, timeout: 15_000, max_connections: 1)
And only 1 connection is available.
Thank you!
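For anyone hitting the same misconfiguration, the `child_spec` call above is meant to go into the application's supervision tree. A hedged sketch (the `MyApp` module names are hypothetical; the pool name and limits are from the comment above):

```elixir
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [
      # Start the named hackney pool under our own supervisor.
      :hackney_pool.child_spec(:main_api_pool, timeout: 15_000, max_connections: 1)
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

Requests are then pointed at the pool, e.g. with HTTPoison: `HTTPoison.get(url, [], hackney: [pool: :main_api_pool])`.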
Closing the ticket since the issue was resolved. Thank you @kuznetskikh for the feedback, and sorry for the late resolution.
Versions:
Hackney: 1.16.0
HTTPoison: 1.7.0
Elixir/Erlang:

Simplest Repro:
Set max pool size to 1 and hit 2 different servers. `free_count` is 2, but it should not be greater than `max`, which is 1.

Suspected bug location
I suspect the bug was introduced in this checkin: ec5b90c5d8d8e4e5ecc6e6f0978c416df84940ca. Note how the check `if PoolSize =< MaxConn` was not carried over from `handle_call({checkin, ...` into the `deliver_socket` function for the `empty` case. The new check `dict:size(Clients) >= MaxConn` in `handle_call({checkout, ...` ensures that no more than `MaxConn` clients will be waiting for a connection at the same time, but it is not sufficient to limit the total pool size.