Closed candlerb closed 2 months ago
I found this in the source code at pkg/pattern/instance.go:
if _, ok := streamMetrics[lvl]; !ok {
level.Warn(i.logger).Log(
"msg", "unknown log level while observing stream",
"level", lvl,
"stream", stream,
)
lvl = constants.LogLevelUnknown
}
Clearly, large numbers of these messages could be generated. Logging them at debug level might be one simple solution. Keeping a map of unknown levels and sending a warning only once for each one seen (with an overall size limit on the map) might be another.
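A minimal sketch of that warn-once idea (my own illustration under assumptions, not Loki code; it reuses the go-kit logger calls from the snippet above):
package levelwarn

import (
	"sync"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
)

// unknownLevelWarner remembers which unknown levels have already been
// reported so each one is warned about at most once.
type unknownLevelWarner struct {
	mu     sync.Mutex
	seen   map[string]struct{}
	maxLen int
	logger log.Logger
}

func newUnknownLevelWarner(logger log.Logger, maxLen int) *unknownLevelWarner {
	return &unknownLevelWarner{
		seen:   make(map[string]struct{}, maxLen),
		maxLen: maxLen,
		logger: logger,
	}
}

// warnOnce logs a single warning per distinct unknown level; once maxLen
// distinct levels have been recorded it stops tracking new ones, so the
// map cannot grow without bound.
func (w *unknownLevelWarner) warnOnce(lvl, stream string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if _, ok := w.seen[lvl]; ok {
		return
	}
	if len(w.seen) >= w.maxLen {
		return
	}
	w.seen[lvl] = struct{}{}
	level.Warn(w.logger).Log(
		"msg", "unknown log level while observing stream",
		"level", lvl,
		"stream", stream,
	)
}
Calling warnOnce(lvl, stream) in place of the level.Warn block above would cap the noise at one line per distinct unknown level.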
According to Grafana, the volume of logs generated per minute is growing, so I will need to turn this pattern ingester off. (EDIT: it's happy after doing that.)
@trevorwhitney is there a reason why we would only support those hardcoded log levels: https://github.com/grafana/loki/blob/913e9f93477b5b811fbcf44d0e750f600c9ded69/pkg/util/constants/levels.go
The reason for hardcoding the levels we support was for performance.
We read the level from structured metadata to avoid re-parsing / re-detecting the log level. When reading this code I mistakenly assumed we would only ever populate structured metadata with one of these constant levels; in the case where we get it from a label, it looks like it could be anything.
The performance gain was just from pre-allocating the map, so I can change that.
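Concretely, the win looked roughly like this (an illustrative sketch, not the actual instance.go code; the level list and the Prometheus counter type are assumptions): pre-allocating the map from the hardcoded constants keeps the ingest path free of map growth, and anything outside that set can simply fall into the unknown bucket without logging per line.
package levelmetrics

import "github.com/prometheus/client_golang/prometheus"

// Roughly the set hardcoded in pkg/util/constants/levels.go (assumed here).
var knownLevels = []string{"trace", "debug", "info", "warn", "error", "critical", "fatal", "unknown"}

// newStreamMetrics pre-allocates one counter per known level up front,
// which is the performance benefit of a fixed level set.
func newStreamMetrics(vec *prometheus.CounterVec) map[string]prometheus.Counter {
	m := make(map[string]prometheus.Counter, len(knownLevels))
	for _, lvl := range knownLevels {
		m[lvl] = vec.WithLabelValues(lvl)
	}
	return m
}

// observe counts a line, silently bucketing any unexpected level as
// "unknown" instead of emitting a warning for every line.
func observe(streamMetrics map[string]prometheus.Counter, lvl string) {
	c, ok := streamMetrics[lvl]
	if !ok {
		c = streamMetrics["unknown"]
	}
	c.Inc()
}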
This should fix it: https://github.com/grafana/loki/pull/14255
Thank you.
On my side, I realised I should also ensure that there's no chance loki can ever ingest its own logs. I'm thinking of running loki and promtail in separate containers, so that rsyslog in the promtail container never receives any journald or /dev/log messages from loki.
I've been keeping track of this bug; has the fix already been merged into the Loki 3.2.1 public Docker image?
@carlopalacio apparently not; I have the same problem with the Docker image v3.2.1.
It will definitely be in 3.3: https://github.com/grafana/loki/pull/14750, which is planned to go out tomorrow.
(I'm not sure if this is user error, but I am raising it in bug format anyway just to capture the details.)
Describe the bug
I am trying out loki 3.2.0 and attempting to use the Explore Logs / pattern ingester functionality.
I am getting a very large volume of these messages logged to loki's stdout, which I wasn't before:
I have promtail (2.9.1) receiving syslog messages and forwarding them to loki, and I believe this is where severity="informational" is coming from - see syslog_message.go
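For reference, this is just the standard RFC 5424 severity table (not a copy of syslog_message.go): ordinary syslog traffic is severity 6, which surfaces as severity="informational", and that keyword is not one of the level constants the pattern ingester expects.
package sysloglevels

// RFC 5424 severity codes and their keywords. Severity 6 is where most
// routine messages land, hence the "informational" label value.
var rfc5424Severities = map[int]string{
	0: "emergency",
	1: "alert",
	2: "critical",
	3: "error",
	4: "warning",
	5: "notice",
	6: "informational",
	7: "debug",
}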
However, the fact that these messages have
{app="loki-linux-amd64",service_name="loki-linux-amd64",severity="informational"}
suggests that either there is a loop (loki is eating its own messages), or loki is generating them with severity="informational" even though earlier on the line it says level=warn.
I don't know enough about the provenance of these labels to diagnose further. It seems to me that:
logcli query '{severity="informational",app!="loki-linux-amd64"}' --since=1h --limit=10
I note that the Grafana Explore Logs plugin (which is the whole point of this exercise) appears to be happy with "informational" as a level.
To Reproduce
/etc/loki/loki.yaml
/etc/systemd/system/loki.service
/etc/default/loki
/etc/loki/promtail.yaml
/etc/systemd/system/promtail.service
/etc/default/promtail
/etc/rsyslog.d/25-promtail.conf
Expected behavior
Messages to be processed successfully.
Environment: