grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Helm SSD pattern_ingester empty ring #13048

Open fculpo opened 5 months ago

fculpo commented 5 months ago

Describe the bug

I'm testing the Explore Logs plugin and get about 4K errors/min on the loki-write pods after enabling the pattern ingester with the Helm chart in SSD mode:

level=error ts=2024-05-28T11:00:39.949623688Z caller=tee.go:53 component=pattern-tee msg="failed to send stream to pattern ingester" err="empty ring"

The write ring is OK though: [screenshot of the write ring]

As a result, the patterns tab in Explore Logs is not working.
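
For anyone who wants to quantify the error rate from their own pod logs, here is a minimal sketch that counts the error line quoted above per minute. It assumes the logs are plain logfmt lines like the one shown; the function names `parse_logfmt` and `empty_ring_errors_per_minute` are made up for this illustration and are not part of Loki.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is a quoted string or a bare token.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Parse one logfmt line into a dict, stripping surrounding quotes."""
    fields = {}
    for key, value in _PAIR.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def empty_ring_errors_per_minute(lines):
    """Count pattern-tee "empty ring" errors, bucketed by minute of the ts field."""
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("component") == "pattern-tee" and fields.get("err") == "empty ring":
            # ts looks like 2024-05-28T11:00:39.949623688Z; the first
            # 16 characters identify the minute.
            counts[fields.get("ts", "")[:16]] += 1
    return dict(counts)
```

Feeding it the output of `kubectl logs` for a loki-write pod (split into lines) should give a per-minute count that can be compared before and after a chart upgrade.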

To Reproduce

Only added this in values:

loki: 
  pattern_ingester:
    enabled: true

The ConfigMap correctly shows pattern_ingester as enabled.

Expected behavior

I could not find any more documentation on enabling Explore Logs, so I assume that enabled: true should be enough.

Environment:

fculpo commented 5 months ago

Grafana errors volume: [screenshot]

matryer commented 5 months ago

Thanks @fculpo we'll have a look at this.

fculpo commented 5 months ago

Happy to help reproducing if needed.

cyriltovena commented 5 months ago

Seems like you're using SSD mode, which means you need at least Helm 6.2.2 and also this commit: https://github.com/grafana/loki/commit/19bfef48cbad57468591e8214c4a5f390091f1e1. Unfortunately, support for patterns in SSD mode has not made it into 3.0.

fculpo commented 5 months ago

I'm using Loki Helm chart 6.6.2 (the latest one as of today), so the commit you referenced is already included.

fculpo commented 5 months ago

Does the Docker image in Helm 6.6.2 not support patterns? (Seems related: https://github.com/grafana/loki/issues/12691.) Shall I wait for 3.0.1? This is not a blocker; I was just eager to try pattern_ingester without changing Docker images.

fculpo commented 4 months ago

Should upgrading now fix this issue?

dehimb commented 4 months ago

Hi @fculpo ,

Thanks for raising this issue. Any workarounds?

fculpo commented 4 months ago

I did not retry with latest chart, it may have been fixed. I will report here when tested

dehimb commented 4 months ago

I did not retry with latest chart, it may have been fixed. I will report here when tested

The same issue with the latest chart version 6.6.4

KrishnaJyothika commented 4 months ago

Hi,

Any update on this? I'm facing the same issue in microservices mode, deployed in a Kubernetes cluster.

level=error ts=2024-07-02T05:00:02.62175909Z caller=tee.go:53 component=pattern-tee msg="failed to send stream to pattern ingester" err="empty ring"

Ingester ring is up and running. Kindly help to fix this issue.

coutug commented 3 months ago

same issue here with version 6.6.6 in distributed mode

diranged commented 3 months ago

We're seeing the same issue, using the latest Helm chart and Loki 3.1.0. The /services page of the distributor pods returns the pattern-ring-client => Running line; however, on our ingester pods I see:

store => Running
ingester => Running
runtime-config => Running
server => Running
memberlist-kv => Running
ring => Running

Was something missed in the Loki 3.1.0 release to turn this on?
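
To compare /services output across pods without eyeballing it, a small sketch like the following may help. It parses the `name => State` text format quoted above and reports which expected modules are not running. The module name `pattern-ingester` used in the usage note is an assumption, as are the helper names; check the actual name on a pod where the pattern ingester is known to run.

```python
def parse_services(text):
    """Parse 'name => State' lines into a {name: state} dict; other lines are ignored."""
    services = {}
    for line in text.splitlines():
        name, sep, state = line.partition("=>")
        if sep and name.strip() and state.strip():
            services[name.strip()] = state.strip()
    return services


def missing_modules(text, expected):
    """Return the expected module names that are absent or not in the Running state."""
    services = parse_services(text)
    return [name for name in expected if services.get(name) != "Running"]
```

For the ingester output above, `missing_modules(text, ["ingester", "pattern-ingester"])` would list only the assumed `pattern-ingester` module, since no such line appears in that output.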

fculpo commented 3 months ago

Hi, it seems that I no longer get these errors and patterns are now correctly detected (Helm 6.7.1 SSD):

loki: {
...
        pattern_ingester: {
          enabled: true,
        },
...
}

patternIngester: {
        replicas: 3,
...
}


LukoJy3D commented 3 months ago

not yet working with 6.7.1 in SSD mode


fculpo commented 3 months ago

@LukoJy3D see my above comment, it is working in 6.7.1 (SSD), so probably only a config issue.

diranged commented 3 months ago

Hi, it seems that I no longer get these errors and patterns are now correctly detected (Helm 6.7.1 SSD):

loki: {
...
        pattern_ingester: {
          enabled: true,
        },
...
}

patternIngester: {
        replicas: 3,
...
}


So I was hopeful this would help us in distributed mode. But while I can get the patternIngester pods to come up, and the ingesters seem to connect to them, the distributors don't know how to connect and throw these errors:

level=error ts=2024-07-26T17:47:46.405553709Z caller=tee.go:53 component=pattern-tee msg="failed to send stream to pattern ingester" err="at least 1 live replicas required, could only find 0 - unhealthy instances: 100.70.58.136:9095"

When I checked the Helm chart, it seems that it doesn't create a memberlist service at all at https://github.com/grafana/loki/tree/helm-loki-6.7.3/production/helm/loki/templates/pattern-ingester, so I don't see how the distributors would find the pattern ingester pods. Am I missing something obvious?
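
As a rough debugging aid for the "unhealthy instances" error above, the sketch below pulls the listed addresses out of the err string and tries a plain TCP connection to each from wherever it is run. It assumes multiple instances would be comma-separated in the message, and the helper names are hypothetical. A successful TCP connect only shows the port is reachable; it says nothing about whether the ring considers the instance healthy.

```python
import re
import socket

_UNHEALTHY = re.compile(r"unhealthy instances: (.+)$")


def unhealthy_instances(err):
    """Return (host, port) pairs listed after 'unhealthy instances:' in a ring error."""
    match = _UNHEALTHY.search(err)
    if not match:
        return []
    pairs = []
    # Assumption: several instances are joined with commas.
    for addr in match.group(1).split(","):
        host, _, port = addr.strip().rpartition(":")
        if host and port.isdigit():
            pairs.append((host, int(port)))
    return pairs


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `is_reachable` from inside a distributor pod against each pair returned by `unhealthy_instances` would at least separate a network or service-discovery problem from a ring membership problem.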