hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Memory Leak With consul client #12564

Open timhungdao opened 2 years ago

timhungdao commented 2 years ago

Overview of the Issue

We use Consul with HAProxy, and the Consul client sometimes consumes all the memory on the server and gets killed by the kernel for being out of memory (OOM).

image

Reproduction Steps

It usually happens when we restart the consul client.
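To pin down when the growth starts after a restart, one option is to sample the agent's resident memory at a fixed interval. The following is a minimal sketch, assuming a Linux host (such as the CentOS 7 machine in this report) where `/proc/<pid>/status` is available. The function names, interval, and sample count are illustrative and not part of Consul.

```python
import time
from pathlib import Path


def parse_vm_rss_kb(status_text: str) -> int:
    """Return the VmRSS value in kB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:     204800 kB"
            return int(line.split()[1])
    raise ValueError("VmRSS line not found")


def sample_rss(pid: int, interval_s: float = 5.0, samples: int = 12):
    """Yield (unix_time, rss_kb) pairs for the process with the given PID."""
    status_path = Path(f"/proc/{pid}/status")
    for _ in range(samples):
        yield time.time(), parse_vm_rss_kb(status_path.read_text())
        time.sleep(interval_s)
```

Usage, assuming the agent's PID has been looked up beforehand (for example with `pidof consul`): `for ts, rss in sample_rss(pid): print(int(ts), rss)`. Starting the sampler right after a restart gives a timeline that can be lined up with the agent logs.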

Consul info for both Client and Server

Client info: We use the latest Consul client (1.11.4).
Server info: Consul v1.10.2, Revision 3cb6eeedb. Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Operating system and Environment details

We use CentOS 7.

Log Fragments

consul-debug-2022-03-16T11-15-39+0700.tar.gz

Amier3 commented 2 years ago

Hey @timhungdao

I see this is your first issue 👀 so welcome to the consul community!

Could you provide us with the output of consul info and tell us what you're using Consul for? It'd help us understand what could be causing your performance issues.

It could also be helpful to see your haproxy config in /etc/haproxy/haproxy.cfg

timhungdao commented 2 years ago

Hi @Amier3, please see below for the consul info output and the HAProxy configuration. We use Consul for service discovery with HAProxy: HAProxy uses a DNS resolver that points to Consul DNS to discover the services. HAProxy is vital for our system, and the Consul agent memory leak could take the system down. For now, we have pointed the HAProxy DNS resolver directly at the Consul servers, bypassing the local agent.

config.txt
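For readers unfamiliar with the setup described above, the lookups HAProxy performs can be approximated with a small standard-library script that sends a DNS query to the local agent. This is only a sketch: it assumes the agent's DNS interface listens on Consul's default port 8600 on 127.0.0.1, and the service name `web` is hypothetical. It builds a plain SRV query and does not reproduce HAProxy's actual resolver behavior.

```python
import socket
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 33, query_id: int = 0x1234) -> bytes:
    """Build a single-question DNS query (qtype 33 = SRV, 1 = A)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    return header + question


def query_agent(name: str, host: str = "127.0.0.1", port: int = 8600,
                timeout: float = 2.0) -> bytes:
    """Send the query over UDP and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (host, port))
        data, _ = sock.recvfrom(4096)
        return data
```

Usage: `resp = query_agent("web.service.consul")`, after which `struct.unpack("!H", resp[6:8])[0]` gives the number of answer records. Running it in a loop while watching the agent's memory is one way to check whether DNS load alone reproduces the growth.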

blake commented 2 years ago

Hi @timhungdao,

Another user recently identified a bug in HAProxy that causes a high number of DNS queries to be sent to Consul. I suspect this would cause increased memory utilization on the agent.

Could you review this thread and let us know whether this sounds like the problem you are experiencing? https://discuss.hashicorp.com/t/hundreds-of-dns-lookups-per-second-for-8-digit-hex-address-addr-mydomain-consul/36534

Thanks.

timhungdao commented 2 years ago

Hi @blake, we use HAProxy 2.2, and that bug is not the problem. Even a large number of DNS queries should not cause this OOM problem. Adding more memory would not help either. For now, we have to point HAProxy at Unbound, which forwards the queries to the Consul cluster.