grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0
23.85k stars, 3.44k forks

[limit] distributor: There are too many distributor goroutines #6719

Open liguozhong opened 2 years ago

liguozhong commented 2 years ago

Is your feature request related to a problem? Please describe.

There are too many distributor goroutines. Can we add a max-goroutines limit configuration to prevent this? Our cluster has reached 12,000 goroutines.

dannykopping commented 2 years ago

@liguozhong can you please provide a pprof profile?

liguozhong commented 2 years ago

    tracker.samplesPending.Store(int32(len(streams)))
    for ingester, samples := range samplesByIngester {
        go func(ingester ring.InstanceDesc, samples []*streamTracker) {
            // Use a background context to make sure all ingesters get samples even if we return early
            localCtx, cancel := context.WithTimeout(context.Background(), d.clientCfg.RemoteTimeout)
            defer cancel()
            localCtx = user.InjectOrgID(localCtx, userID)
            if sp := opentracing.SpanFromContext(ctx); sp != nil {
                localCtx = opentracing.ContextWithSpan(localCtx, sp)
            }
            d.sendSamples(localCtx, ingester, samples, &tracker)
        }(ingesterDescs[ingester], samples)
    }

More than 12,000 goroutines originate at pkg/distributor/distributor.go:341.

periklis commented 2 years ago

@liguozhong How many ingesters do you run in this cluster? Is the number of ingesters stable in the memberlist? AFAIU the number of goroutines should track the number of healthy ingesters.

liguozhong commented 2 years ago

How many ingesters do you run in this cluster?

72 ingesters.

The distributor goroutine count rises only when an ingester is unavailable. The distributor should not have 10,000 goroutines, regardless of the state of the ingesters.

periklis commented 2 years ago

WOW! 72 ingesters. That's huge. Can you graph the availability of the ingesters, e.g. when they join and leave the ring? I bet the map has more than 72 items, because joining and leaving the ring is time-sensitive.