go-kit / kit

A standard library for microservices.
https://gokit.io
MIT License

Consul Implementation of Instancer spams Consul and pushes consul agents to 100% CPU #1214

Closed by jkratz55 1 year ago

jkratz55 commented 2 years ago

What did you do?

With the Go kit Consul implementation of Instancer in Go kit 0.11 and later, the Instancer spams Consul with hundreds of RPC requests per second after any register or deregister event that occurs once the Instancer has been created. As a result, our Consul agent pods went to 100% CPU and our Consul cluster was overwhelmed.

Steps to re-create:

  1. Start Consul

  2. Register a dummy service

curl --location --request PUT 'http://localhost:8500/v1/agent/service/register' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "name": "db-example",
    "id": "db-example",
    "address": "127.0.0.1",
    "port": 5432,
    "tags": [ "database" ]
  }'

  3. Open a terminal and run the following command to see what the Consul agent is doing

consul monitor -log-level=trace

  4. Create an Instancer for the db-example service (see sample code below) and run the app

  5. Deregister the service

curl --location --request PUT 'http://localhost:8500/v1/agent/service/deregister/db-example'

Observe the output of consul monitor: the agent is slammed with requests, and this does not stop until the Instancer is stopped or the application exits.
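For scripting the reproduction (for example from an integration test), the register and deregister calls from the steps above could also be built in Go. This is only a sketch using the standard library: the helper names registerRequest and deregisterRequest are hypothetical, and the requests are printed rather than sent so that the code runs without a Consul agent.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Service definition copied from the curl command in the reproduction steps.
const serviceJSON = `{"name": "db-example", "id": "db-example", "address": "127.0.0.1", "port": 5432, "tags": ["database"]}`

// registerRequest (hypothetical helper) builds the PUT request that registers
// db-example. base is the agent address, e.g. http://localhost:8500.
func registerRequest(base string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPut, base+"/v1/agent/service/register", strings.NewReader(serviceJSON))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// deregisterRequest (hypothetical helper) builds the PUT request that
// deregisters the service with the given ID.
func deregisterRequest(base, id string) (*http.Request, error) {
	return http.NewRequest(http.MethodPut, base+"/v1/agent/service/deregister/"+id, nil)
}

func main() {
	base := "http://localhost:8500"

	reg, err := registerRequest(base)
	if err != nil {
		panic(err)
	}
	dereg, err := deregisterRequest(base, "db-example")
	if err != nil {
		panic(err)
	}

	// Printed instead of sent; pass each request to http.DefaultClient.Do
	// to issue it against a running agent.
	fmt.Println(reg.Method, reg.URL)
	fmt.Println(dereg.Method, dereg.URL)
}
```

Sending the deregister request while the example application below is running corresponds to step 5.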

Simple example application.

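package main

import (
    "fmt"
    "time"

    "github.com/go-kit/kit/log"
    "github.com/go-kit/kit/sd"
    "github.com/go-kit/kit/sd/consul"
    "github.com/hashicorp/consul/api"
)

func main() {
    // Connect to the local Consul agent using the default configuration.
    apiClient, err := api.NewClient(api.DefaultConfig())
    if err != nil {
        panic(err)
    }
    client := consul.NewClient(apiClient)

    // Watch the db-example service with no tag filter. The final argument
    // (passingOnly) restricts results to instances passing their health checks.
    instancer := consul.NewInstancer(client, log.NewNopLogger(), "db-example", []string{}, true)

    // Subscribe to instance updates from the Instancer.
    events := make(chan sd.Event, 1)
    go func() {
        instancer.Register(events)
    }()

    // Print every update received.
    go func() {
        for event := range events {
            fmt.Println(event.Instances)
        }
    }()

    // Keep the process alive so the Instancer keeps running.
    time.Sleep(1 * time.Hour)
}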

What did you expect?

Consul agent should not be getting spammed with hundreds of requests a second.

What happened instead?

Instancer spammed Consul with way too many requests and drove the CPU utilization to 100%.

7.0.0.1:53702 latency=9.375µs
2022-02-12T18:06:20.710-0500 [DEBUG] agent.http: Request finished: method=GET url=/v1/health/service/db-example?index=155&passing=1 from=127.0.0.1:53702 latency=12.208µs
2022-02-12T18:06:20.710-0500 [DEBUG] agent.http: Request finished: method=GET url=/v1/health/service/db-example?index=155&passing=1 from=127.0.0.1:53702 latency=162.792µs
2022-02-12T18:06:20.710-0500 [DEBUG] agent.http: Request finished: method=GET url=/v1/health/service/db-example?index=155&passing=1 from=127.0.0.1:53702 latency=54.791µs
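The log above shows the same blocking query (index=155) completing in microseconds and being reissued immediately. A common client-side safeguard against this kind of loop is to pause between queries whenever a result comes back without advancing the index. The following is a minimal Go sketch of that idea only. It is not taken from the go-kit fix, and nextDelay, minDelay, and maxDelay are hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical bounds for this sketch; not values taken from go-kit or Consul.
const (
	minDelay = 100 * time.Millisecond
	maxDelay = 10 * time.Second
)

// nextDelay returns how long to wait before issuing the next blocking query.
// If the index advanced, the result carried new data and the caller may query
// again right away. If the index did not advance, the previous delay is
// doubled, starting at minDelay and capped at maxDelay.
func nextDelay(prev time.Duration, lastIndex, newIndex uint64) time.Duration {
	if newIndex != lastIndex {
		return 0
	}
	if prev < minDelay {
		return minDelay
	}
	if d := prev * 2; d < maxDelay {
		return d
	}
	return maxDelay
}

func main() {
	// Simulate a query that keeps returning index 155 immediately,
	// as in the log above, and print the growing delays.
	var delay time.Duration
	for i := 0; i < 8; i++ {
		delay = nextDelay(delay, 155, 155)
		fmt.Println(delay)
	}
}
```

With a guard like this, a query that returns immediately with an unchanged index would be retried at most a few times per second at first and then progressively less often, instead of hundreds of times per second.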

icowan commented 2 years ago

I ran into this problem too.

icowan commented 1 year ago

go get github.com/go-kit/kit@master [e6a9818]

jkratz55 commented 1 year ago

This can be closed, since the fix was merged and is now in the v0.13.0 release.