hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Go routine leaks in consul agent on 1.8.0 #8281

Open fredstanley opened 4 years ago

fredstanley commented 4 years ago

Overview of the Issue

I see a large number of goroutines even after all services have been deregistered from Consul. The goroutine count stays about the same even after a long idle period. This is with Consul 1.8.0.

Consul info for both Client and Server

agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 3111cb8c
    version = 1.8.0
consul:
    acl = disabled
    known_servers = 1
    server = false
runtime:
    arch = amd64
    cpu_count = 16
    goroutines = 9491
    max_procs = 16
    os = linux
    version = go1.14.4
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 3
    members = 2
    query_queue = 0
    query_time = 1

dnephin commented 4 years ago

Thank you for the bug report! We could use some more information to better understand the problem.

Is this a problem that you discovered after upgrading to 1.8.0 from an earlier version, or is this a new Consul install and the workload has never been run on a different version?

What type of workload was Consul handling before the services were deregistered? Which APIs or CLI commands were used (I assume services were registered)? Are you using Connect Service Mesh? If you are able to start a fresh cluster, can you isolate which operations are creating the goroutines? Is it registering a service, or is it querying the API for information?

Do both the clients and the servers have approximately the same number of goroutines (about 9,500)?