Closed sandstrom closed 5 years ago
What sort of support are you looking for?
Loki is format-agnostic and ingests log lines as strings, whether they are access logs, logfmt key/value pairs, or JSON.
Grafana's Explore UI shows you the log lines, and if they are JSON, some support for in-browser parsing to plot distributions of values. Notice how the fields in the log line have an orange underline, that means they were parsed successfully:
Does it have the ability to search against JSON fields? It would be great if, for example, I could filter my logs by a `level` field contained in the log line.
@r-moiseev You can use regex as part of the query to "search" against your json fields.
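As a sketch of that approach (the `app` label and `level` field here are hypothetical, not from this thread), a LogQL regex line filter can match a JSON key/value pair textually, without actually parsing the JSON:

```logql
{app="myapp"} |~ "\"level\":\"error\""
```

Note this matches the raw text of the line, so it can produce false positives if the same substring appears elsewhere in the log.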
@draeron I don't think this feature will be available soon. In the meantime, you can add the key you want to filter on as a label sent to the Loki server, or you could add an additional promtail parser stage that parses your logs as JSON.
Just to share an example: in GKE/Stackdriver I can simply search for key/values in the `jsonPayload`:

```
jsonPayload.@l="Warning"
```
Would be awesome to have some direct support for structured logging.
With the 0.1.0 release there is included a pipeline which includes a json stage that allows extraction of json log data to be used in labels and/or metrics using JMESPath expressions.
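As a minimal sketch of such a pipeline (the field names here are hypothetical, not from the thread), a `json` stage can extract values with JMESPath expressions and a `labels` stage can promote them to Loki labels:

```yaml
pipeline_stages:
- json:
    expressions:
      level: level              # JMESPath expression; extracts top-level "level"
      user_id: payload.user.id  # nested extraction
- labels:
    level:                      # promote the extracted value to a Loki label
```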
Great!
I have a pod that emits logs in JSON format, but the logs are not displayed as nested objects in Loki, only as a long string (the content of the `log` field):

```json
{"log":"{\"verb\":\"UPDATED\",\"event\":{\"metadata\":{\"name\":\"minio-backup.15c16e910ad17555\",\"namespace\":\"minio\",\"selfLink\":\"/api/v1/namespaces/minio/events/minio-backup.15c16e910ad17555\",\"uid\":\"75b2664e-d08a-11e9-aaf1-42010aa40066\",\"resourceVersion\":\"805300\",\"creationTimestamp\":\"2019-09-06T09:41:11Z\"},\"involvedObject\":{\"kind\":\"CronJob\",\"namespace\":\"minio\",\"name\":\"minio-backup\",\"uid\":\"a355d0fa-cf90-11e9-aaf1-42010aa40066\",\"apiVersion\":\"batch/v1beta1\",\"resourceVersion\":\"35722856\"},\"reason\":\"UnexpectedJob\",\"message\":\"Saw a job that the controller did not create or forgot: test-minio-backup\",\"source\":{\"component\":\"cronjob-controller\"},\"firstTimestamp\":\"2019-09-05T03:55:14Z\",\"lastTimestamp\":\"2019-09-06T10:46:18Z\",\"count\":1373,\"type\":\"Warning\"},\"old_event\":{\"metadata\":{\"name\":\"minio-backup.15c16e910ad17555\",\"namespace\":\"minio\",\"selfLink\":\"/api/v1/namespaces/minio/events/minio-backup.15c16e910ad17555\",\"uid\":\"75b2664e-d08a-11e9-aaf1-42010aa40066\",\"resourceVersion\":\"805295\",\"creationTimestamp\":\"2019-09-06T09:41:11Z\"},\"involvedObject\":{\"kind\":\"CronJob\",\"namespace\":\"minio\",\"name\":\"minio-backup\",\"uid\":\"a355d0fa-cf90-11e9-aaf1-42010aa40066\",\"apiVersion\":\"batch/v1beta1\",\"resourceVersion\":\"35722856\"},\"reason\":\"UnexpectedJob\",\"message\":\"Saw a job that the controller did not create or forgot: test-minio-backup\",\"source\":{\"component\":\"cronjob-controller\"},\"firstTimestamp\":\"2019-09-05T03:55:14Z\",\"lastTimestamp\":\"2019-09-06T10:41:14Z\",\"count\":1355,\"type\":\"Warning\"}}\n","stream":"stdout","time":"2019-09-06T10:46:18.681193448Z"}
```
Does Loki automatically handle JSON format, or is something else still missing?
Hi @minhdanh, I just found a solution for eventrouter :)
```yaml
- match:
    selector: '{app="eventrouter"}'
    stages:
    - json:
        expressions:
          log:
    - json:
        source: log
        expressions:
          event_verb: verb
          event:
    - json:
        source: event
        expressions:
          event_reason: reason
          involvedObject:
          source:
    - json:
        source: involvedObject
        expressions:
          event_kind: kind
          event_namespace: namespace
          event_name: name
    - json:
        source: source
        expressions:
          event_source_host: host
          event_source_component: component
    - labels:
        event_verb:
        event_kind:
        event_reason:
        event_namespace:
        event_name:
        event_source_host:
        event_source_component:
```
Hi @Lucaber, thanks for the solution, but it doesn't seem to work for me. I added your snippet to promtail's pipelineStages config: https://github.com/grafana/loki/blob/master/production/helm/promtail/values.yaml#L29
```yaml
promtail:
  pipelineStages:
  - match:
      selector: '{app="eventrouter"}'
      stages:
      - json:
          expressions:
            log:
      - json:
          source: log
          expressions:
            event_verb: verb
            event:
      - json:
          source: event
          expressions:
            event_reason: reason
            involvedObject:
            source:
      - json:
          source: involvedObject
          expressions:
            event_kind: kind
            event_namespace: namespace
            event_name: name
      - json:
          source: source
          expressions:
            event_source_host: host
            event_source_component: component
      - labels:
          event_verb:
          event_kind:
          event_reason:
          event_namespace:
          event_name:
          event_source_host:
          event_source_component:
```
Then I deployed promtail again, but the logs still look the same in Loki.
@minhdanh Loki only knows logs as byte arrays for storage; everything is basically a string.
Your log example looks like the output of a Docker log line, which has JSON nested inside JSON.
I'm not quite sure what you are ultimately looking for in Grafana, but the simplest pipeline config would just include the `docker` stage, which will unroll the Docker JSON and set the inner `log` JSON as the log line; it should then be un-escaped and appear like normal JSON.
The config @Lucaber pasted sets a series of labels from the log but does not manipulate the output sent to Loki; you must use an `output` pipeline stage for this (the `docker` stage is internally just a `json`, `timestamp`, `labels`, and `output` stage).
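Based on that description, a sketch of a hand-rolled equivalent of the `docker` stage might look like the following, assuming the usual Docker `log`/`stream`/`time` wrapper:

```yaml
pipeline_stages:
- json:
    expressions:
      output: log       # the inner (escaped) JSON payload
      stream: stream
      timestamp: time
- timestamp:
    source: timestamp
    format: RFC3339Nano
- labels:
    stream:
- output:
    source: output      # replace the line sent to Loki with the inner JSON
```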
Also @Lucaber I believe you could make your config a little more concise and probably a little faster:
```yaml
promtail:
  pipelineStages:
  - match:
      selector: '{app="eventrouter"}'
      stages:
      - docker: {}
      - json:
          expressions:
            event_verb: verb
            event_kind: event.involvedObject.kind
            event_reason: event.reason
            event_namespace: event.involvedObject.namespace
            event_name: event.metadata.name
            event_source_host: event.source.host
            event_source_component: event.source.component
      - labels:
          event_verb:
          event_kind:
          event_reason:
          event_namespace:
          event_name:
          event_source_host:
          event_source_component:
```
If all your logs are Docker logs, you could also move that stage outside the `match`:
```yaml
promtail:
  pipelineStages:
  - docker: {}
  - match:
      selector: '{app="eventrouter"}'
      stages:
      - json:
          ...
```
The advantage of using the `docker` stage is that it sets the timestamp from the log line and sets the output to the un-escaped JSON of the actual log message.
Oh yes, I was looking for Loki labels to easily filter the logs. I previously tried something similar:
```yaml
- match:
    selector: '{app="eventrouter"}'
    stages:
    - json:
        expressions:
          event_verb: log.verb
    - labels:
        event_verb:
```
I also tried `verb` instead of `log.verb`, but my label was still empty (`null`). Maybe the `docker` stage does the trick; I will try this again later.
@slim-bean Thank you. Apparently I had removed `docker: {}` from the pipeline stages, which is why it didn't work. I added it back and it's now working, with correct JSON format in Grafana.
> I'm not quite sure what you are ultimately looking for in Grafana?
With JSON supported by Loki, I was expecting to search/query the logs using something like `object.property=value`. This is possible, right?
Currently no; neither Grafana nor LogQL has any higher-level support for JSON. If you are using logcli you can use `-o raw` and pipe into something like `jq` to manipulate the JSON directly. In Grafana your only current option is regex (but this will just match against the entire log line).
There are plans to include better handling of JSON in the future but for now all logs are stored and treated the same.
@Lucaber

> I just found a solution for eventrouter :)

Hi, could you provide your full promtail.yaml (or helm values.yaml) for your eventrouter-promtail-loki solution? That would be great :-)
If we do not have nested JSON objects, can I expect this JSON log line

```json
{"log":"database hrdb is not running\n","loglevel":"error","time":"2020-01-12T01:11:11.870000000-07.00"}
```

to be converted to the following format?

```
ts                                    output                          loglevel
===================================================================================
2020-01-12T01:11:11.870000000-07.00   database hrdb is not running\n  error
```
I am using the following config, expecting it to parse the JSON log line:
```yaml
- job_name: logjson
  static_configs:
  - targets:
    - localhost
    labels:
      job: jsonlogs
      __path__: /tmp/log.json
  pipeline_stages:
  - json:
      expressions:
        output: log
        loglevel: loglevel
        timestamp: time
  - labels:
      loglevel:
  - timestamp:
      source: time
      format: RFC3339Nano
```
This is kind of a deal breaker for us because:
I love the Loki design, but, same as @DenisBiondic, without dynamic structured logging (which is what JSON would bring) it's tough for us to use Loki.
We are using Serilog in our C# stack to log lots of fields: not labels, just fields here and there. Using labels wouldn't work since there are plenty of fields with high cardinality. Each team is responsible for keeping the field/log generation in our code in sync with the queries (basically Grafana dashboards with variables). Using regex only would be a huge step backward from the structured-logging path we took (and are very happy with).
@DenisBiondic check https://github.com/grafana/loki/pull/1848 and leave us some feedback like @alexvaut did; this will help our internal discussion with the team.
I stumbled upon this issue while looking to introduce Loki to my team as well. Like @DenisBiondic, we run mostly structured logging, and I'm not looking forward to doing any sort of regex to find things; that seems like a step backwards.
Other resources I've found:
- https://stackoverflow.com/questions/58564836/how-to-promtail-parse-json-to-label-and-timestamp
- https://github.com/grafana/loki/blob/master/docs/clients/promtail/pipelines.md?ts=4
- https://grafana.com/blog/2019/07/25/lokis-path-to-ga-adding-structure-to-unstructured-logs/

All of these point to what feels like needing to know the JSON fields ahead of time, and even converting a JSON log line back into a "structured text" line.
We're working on solving this via LogQL. You'll be able to select at query time which property you want to show, if not all of them (but that's hard to read).
Any update here? We are attempting to roll this out company-wide, but JSON logging seems to be a blocker.
We're reviewing the final design doc. It's coming!
@cyriltovena any update?
@cyriltovena why is it closed, any resolution here?
@slim-bean any chance to have this reopened? This issue is about full JSON support (ingesting logs + querying logs with JSON support). Only the first part is done, via the `json` stage in the pipeline, and more and more people need this.
Yep, I'm working on the implementation; ETA ObservabilityCON.
any update on this ?
Really looking forward to the demo'd feature. Any ETA on release?
For those who end up here, the syntax is

```logql
{job="mysql"} | json | line_format "{{.message}}"
```

where `message` is a JSON field.
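Building on that syntax, fields extracted by `| json` can also be filtered at query time. A sketch, assuming the log lines contain `level` and `message` fields (both hypothetical here):

```logql
{job="mysql"} | json | level="error" | line_format "{{.message}}"
```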
@joeky888 Can we parse further after that?
I have shipped the logs using fluentbit. A sample log shipped by fluentbit looks as follows:
```json
{
  "log": "2021-04-16 10:01:29.2037 [INFO] [00000000-0000-0000-0000-000000000000] The very long message",
  "stream": "stdout",
  "time": "2021-04-16T10:01:29.204350751Z",
  "kubernetes": {
    "pod_name": "mypodname",
    "pod_id": "mypodid",
    "host": "mynode",
    "container_name": "mycontainer",
    "docker_id": "mydockerid",
    "container_hash": "mycontainerhash",
    "container_image": "mycontainerimage"
  }
}
```
Can I parse the `log` field using `regexp`, so that I can later filter by, for example, `correlationId`? I tried the following:

```logql
{namespace="mynamespace"} | json log="log" | line_format "{{.log}}" | regexp "^(?P<time>\\S+\\s+\\S+)\\s+\\[(?P<logLevel>\\S+)\\]\\s+\\[(?P<correlationId>\\S*)\\]\\s+(?P<message>.*)$"
```

But that does not seem to work.
I am expecting the following labels: `time`, `logLevel`, `correlationId`, `message`.
Hello @alfianabdi, your question is out of scope here; you should open a new issue. Also, your regex groups look wrong.
Also, I am NOT a maintainer here :)
@joeky888 Thanks. I also found the mistake: the `^` and `$` anchors. After removing them, it works.
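For reference, the working query from the exchange above would then be the same expression minus the anchors (namespace value quoted; label and field names as in the earlier comment):

```logql
{namespace="mynamespace"} | json log="log" | line_format "{{.log}}" | regexp "(?P<time>\\S+\\s+\\S+)\\s+\\[(?P<logLevel>\\S+)\\]\\s+\\[(?P<correlationId>\\S*)\\]\\s+(?P<message>.*)"
```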
Loki looks very promising!
Are there any plans to support ingestion of JSON log lines?
It seems to be a pretty common structure for logs these days. Here are some examples (can add more):