Closed: sona0108 closed this issue 1 year ago.
Hello,
Thanks for your report! It seems that there's a bug causing an infinite loop in the peers applet (which means that your config is larger than what you posted and has a "peers" section). By the way, it's fine to trim the config for privacy purposes; just consider mentioning a few points such as "in addition there's a peers section with 2 peers" or "we have 12 frontends and 40 backends", as that sometimes helps eliminate some possibilities.
Could you please retrieve the core file, open it with haproxy under gdb, then issue `t a a bt full` (short for `thread apply all bt full`) so that we know where the threads were running? I'm particularly interested in resolving that address 0x5606b154335a to a line number. Just out of curiosity, is it a distro package or a version that you built internally?
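For what it's worth, this can be done non-interactively with a gdb batch file; a minimal sketch (the binary and core paths mentioned below are hypothetical and must be adjusted to your system):

```
# bt.gdb: resolve the reported address and dump all thread backtraces
set pagination off
info line *0x5606b154335a
thread apply all bt full
```

It can then be run with something like `gdb /usr/sbin/haproxy /path/to/core -batch -x bt.gdb > bt.txt`, so that only the text output needs to be shared, never the core itself.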
Thanks!
Hello,
It is a distro package. More precisely, it's the latest stable release, which isn't available from any distro: it is the standard stable tgz pulled from haproxy.org and swapped into the distro RPM (with the one patch removed, as it was already present in the source).
We also enlarged the stick tables to give us a longer run time before failure. We found that we could run a reload and it would briefly spike (likely the result of the old process copying stick tables into the new process), then return to normal operation and clean up the old entries.
I will share the gdb output from the core file later.
OK. Be careful not to share the core file itself, just gdb's output!
@sona0108, in ticket #2034 @alekseyp-amzn is right and found a bug in the expiration algorithm that will trigger every 49.7 days. In addition, all expired entries are purged in one loop, so I suspect that the issue of extraneous elements rotting in the table can cause a violent purge at one moment and trigger the watchdog. We're currently working on a patch (tested on 2.4 and 2.8-dev, now needs to be finished, cleaned and merged). I think I'll also see how to implement some limits to avoid purging millions of entries at once. No need for the core anymore :-)
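For the curious, the 49.7-day figure comes from HAProxy's internal clock being a 32-bit millisecond tick counter (2^32 ms is roughly 49.71 days). Below is a simplified Python sketch, with illustrative function names of my own and not HAProxy's actual code, of how a naive expiry check misbehaves around the wrap and lets entries rot in the table:

```python
# Simplified model only: HAProxy keeps time as 32-bit millisecond ticks.
# This is NOT the real tick code, just an illustration of the wrap.
WRAP = 2 ** 32  # the 32-bit ms counter rolls over at this value

def wrap_days():
    """Days between rollovers: 2^32 ms is roughly 49.71 days."""
    return WRAP / 1000 / 60 / 60 / 24

def naive_is_expired(expire_tick, now_tick):
    # Plain comparison: wrong once the counter has wrapped, so stale
    # entries can "rot" in the table for up to another ~49.7 days.
    return expire_tick <= now_tick

def wrapping_is_expired(expire_tick, now_tick):
    # Wrap-safe: take the distance between ticks modulo 2^32 and
    # consider anything in the "past half" of the range expired.
    return ((now_tick - expire_tick) % WRAP) < WRAP // 2

# Just after a wrap: "now" is small, the expiry tick is near the top.
now, expire = 10, WRAP - 5
print(naive_is_expired(expire, now))     # False: entry wrongly kept
print(wrapping_is_expired(expire, now))  # True: it really is expired
```

When many entries are kept past their expiry this way, they all become due at once after the wrap, which matches the "violent purge in one loop" scenario that can trip the watchdog.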
Hello,
Thank you for the update, I think it does make sense. Please let me know once you have the new patch.
We currently have the patch in 2.8-dev and 2.7-maint. It still needs to be backported to older releases. I don't think it will trivially apply to 2.4 so I need to check first. The last occurrence of the wrapping was on Jan 30th and the next one is for Mar 20th 6:10pm so we still have a bit of margin.
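As a quick sanity check on those dates (the year, 2023, and a midnight time of day are assumptions for illustration), the next wrap is just the previous one plus 2^32 milliseconds:

```python
from datetime import datetime, timedelta

WRAP_MS = 2 ** 32  # 32-bit millisecond tick counter
wrap_period = timedelta(milliseconds=WRAP_MS)  # about 49.71 days

# Jan 30th is the last wrap mentioned above; the year and time of day
# are assumed, so the computed time will differ from the exact 6:10pm.
last_wrap = datetime(2023, 1, 30)
next_wrap = last_wrap + wrap_period
print(wrap_period.days)   # 49
print(next_wrap.date())   # 2023-03-20
```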
Our current HAProxy version is 2.4.18 or so. Do you think the patch will work once it is ready for 2.4, or do we need to upgrade to the latest version in which the patch is already available?
Yes it will work, as I could already test it there when I first reproduced the problem. No need to rush an upgrade yet ;-)
Sure, thank you very much!
Since you are working on the patch, we will not upgrade anything until then. One thing though: is the same patch available for version 2.6? We are planning to upgrade HAProxy in the future, so please let me know your thoughts.
Yes the patch is available here if you want: http://git.haproxy.org/?p=haproxy-2.6.git;a=commitdiff_plain;h=75cf53393
But we'll issue another 2.6 tomorrow due to a security issue, so I'd suggest waiting for the next version (that shouldn't prevent you from trying the patch above on your side, though).
What would the next version be?
Unless I'm mistaken it should be 2.6.9.
OK, then I will suggest my team wait for 2.6.9. Do you have any ETA?
As I said, it's tomorrow. We planned 5pm CET.
Thank you very much! please let me know once it is ready.
It will be announced like the other ones anyway.
2.6.9 is out now.
I'm marking the problem as fixed and we can wait a day or two before closing.
Thank you very much for the update! Please do not close the case yet; give us some days and we will update you here.
Hi @sona0108, were you able to try out 2.6.9 or later to confirm that the issue is gone? Thanks
A gentle ping
2.6.14 was released. I'm closing the issue because a fix was provided. Feel free to reopen it and add more info if the issue is not fixed. Thanks!
Detailed Description of the Problem
This is a follow-up to our previous case https://github.com/haproxy/haproxy/issues/1720; we want to continue the discussion. We have 4 HAProxy servers in our environment, and they have some stickiness issues:
1. Every one or two months the stick table graph spikes.
2. As you can see in the attached screenshot, the two stick tables are not aligned, even though they should increase or decrease in parallel.
Because of this we face downtime for customers, as all the services go down automatically. Can you please help us resolve the stickiness of the HAProxy tables?
We are raising this request because we think this is likely a bug in HAProxy, even though we have already upgraded HAProxy to 2.4.18.
Expected Behavior
We have upgraded our HAProxy environment to version 2.4.18 and we think it should not behave like this.
Steps to Reproduce the Behavior
NA
Do you have any idea what may have caused this?
No response
Do you have an idea how to solve the issue?
No response
What is your configuration?
Output of `haproxy -vv`
Last Outputs and Backtraces
Additional Information
We did not apply any patch, there was no change in the environment, and nothing happened that could cause the issue.