logdyhq / logdy-core

Web based real-time log viewer. Stream ANY content to a web UI with autogenerated filters. Parse any format with TypeScript.
https://logdy.dev
Apache License 2.0

Parse and generate derived columns from column containing JSON #65

Open CAFxX opened 1 week ago

CAFxX commented 1 week ago

I'm following the guide to use logdy with journald logs, but I am running into a seemingly dumb roadblock. My service logs in JSON format, so the MESSAGE field returned by journalctl -o json contains the serialized JSON object emitted by my service as a (correctly escaped) JSON string. In my handler I set isJson:true, but that only prettifies the visualization of the contents of the MESSAGE field.

What I would like, instead, is for the contents of the JSON object emitted by my service to be parsed and extracted into their own columns. In my service I am using log/slog, so some fields are common across all log entries (e.g. msg, source.*, ...) but others vary depending on the message. Unfortunately I can't seem to find any guide to do this.

PeterOsinski commented 1 week ago

Hey @CAFxX, this sounds doable. Can you paste a sample of the logs, or is there no point since it's pure JSON? I tried to reproduce this on one of my machines: I ran logdy stdin 'journalctl -o json' and opened the UI:

image

I got a single raw column. Next I went to settings and clicked auto-generate, which gave me this (after that I removed the raw column):

image

Is this what you expect to happen, or am I missing something?

CAFxX commented 1 week ago

Yes that works for me as well, but the problem is that the column MESSAGE contains JSON, because that's what my service generates:

image

in this case, what I would like to do is to extract the fields in that JSON object into their own columns. As mentioned above, some fields like msg, source.*, level, and time will be common to all events (this is the log/slog convention) whereas others (like fetcher.ticker in this example) are only present in some events.

Just FTR, these are my settings for the column MESSAGE (I added the isJson: true):

image

PeterOsinski commented 1 week ago

@CAFxX use a middleware for it. Here's how I did it:

image

Just replace "{\"blah\":123}" with line.json_content['MESSAGE'], then you'll be able to auto-generate columns as well. Keep in mind that you need to fill in ts as well. Let me know if it helped.

CAFxX commented 5 days ago

This will overwrite all the journald columns though. Is there a way to keep the old columns and just add the new ones (i.e. those from the MESSAGE column)? Also, how can I handle nested objects? For example, the source field in my example, which is inserted by log/slog, is a nested object with three fields. That case is probably easy to handle because the field names are known in advance, but other fields may have non-constant nested fields.

PeterOsinski commented 3 days ago

It's all possible but requires some tinkering, below is the code for the middleware you can use:

(line: Message): Message | void => {
    // this is where you can parse a serialized JSON
    const parsed = JSON.parse("{\"blah\":123,\"nested\":{\"obj\":100}}")
    return {
        json_content: {
            ...line.json_content, // unwind the current JSON content
            MESSAGE_APP_JSON: parsed.blah, // add extra properties on the top level of json_content
            NESTED_FIELD: parsed.nested.obj // you can access nested fields too
        },
        is_json: true, ts: 1, log_type: 1, content: null
    };
}

Next, you can easily display these fields as columns:

image
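To apply the snippet above to the actual journald MESSAGE field rather than a hard-coded string, something along these lines might work. This is only a sketch under assumptions: the Message type below is a minimal stand-in containing just the fields used in this thread (inside the Logdy editor the real type is already provided), parseMessage and the APP_* column names are hypothetical, and the msg, level and source keys are assumed from the log/slog output described earlier.

```typescript
// Minimal stand-in for Logdy's Message type (assumption: only the fields used in this thread).
type Message = {
    json_content?: Record<string, any>;
    is_json?: boolean;
    ts?: number;
    log_type?: number;
    content?: string | null;
};

// Hypothetical helper: returns the parsed object, or null when the value is
// not a string holding a JSON object (e.g. plain-text lines from other services).
function parseMessage(raw: unknown): Record<string, any> | null {
    if (typeof raw !== "string") return null;
    try {
        const parsed = JSON.parse(raw);
        const isObject = typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
        return isObject ? parsed : null;
    } catch {
        return null;
    }
}

const middleware = (line: Message): Message | void => {
    const app = parseMessage(line.json_content?.["MESSAGE"]);
    if (app === null) return line; // leave non-JSON lines untouched

    return {
        ...line, // keep ts, log_type, etc. as received
        is_json: true,
        json_content: {
            ...line.json_content, // keep the journald fields
            APP_MSG: app.msg,
            APP_LEVEL: app.level,
            APP_SOURCE_FILE: app.source?.file, // nested fields with known names
            APP_SOURCE_LINE: app.source?.line,
        },
    };
};
```

Spreading line first keeps whatever ts and log_type the line already had, so they do not need to be filled in by hand.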

CAFxX commented 3 days ago

Thanks, sorry to keep asking but... any suggestion about how to deal with this?

other fields instead may have non-constant nested fields

So e.g. one log event may have { A: { B: true, C: { x: 42, y: -1 } } }, and the next event { D: [10, 20, 30] }. I cannot always know the names of those fields, or their types, in advance. Ideally I would like to have a way to flatten such objects, e.g. turning the first into something like

A.B: true
A.C.x: 42
A.C.y: -1

and the second one into something like

D/1: 10
D/2: 20
D/3: 30

(these are simple examples, real objects can be more complex)

So I guess the question is: is there a way to flatten an arbitrary object, so that the keys of the result uniquely identify the leaf value (I guess something like jsonpath/jsonpointer), and the values are the ones identified by the key?

PeterOsinski commented 2 days ago

Do you mean you would like to "flatten" the object but get multiple rows in return?

CAFxX commented 22 hours ago

I mean having each leaf value in the nested object become its own column (so e.g. my first example would have the columns A.B, A.C.x, and A.C.y)

PeterOsinski commented 20 hours ago

Aha, in that case I think you can flatten the object in the middleware using this code

(line: Message): Message | void => {

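    // Recursively flattens nested objects (and arrays) into a single-level object
    // whose keys are the dot-joined paths, e.g. { A: { B: 1 } } becomes { "A.B": 1 }.
    function flattenObject(ob: Record<string, any>): Record<string, any> {
        const toReturn: Record<string, any> = {};

        for (const i in ob) {
            if (!Object.prototype.hasOwnProperty.call(ob, i)) continue;

            if (typeof ob[i] === 'object' && ob[i] !== null) {
                // Nested value: flatten it first, then prefix its keys with the parent key
                const flatObject = flattenObject(ob[i]);
                for (const x in flatObject) {
                    if (!Object.prototype.hasOwnProperty.call(flatObject, x)) continue;

                    toReturn[i + '.' + x] = flatObject[x];
                }
            } else {
                // Leaf value: copy it as is
                toReturn[i] = ob[i];
            }
        }
        return toReturn;
    }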

    line.json_content = {
        ...line.json_content, // unwind the current JSON content
        ...flattenObject({ A: { B: { C: 1, C_nested: "lorem ipsum" }, bar: "baz" }, foo: true }) // add more top-level fields
    }
    return line
}
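
The same idea can be pointed at the real MESSAGE field instead of a literal object. The sketch below is a variant under assumptions, not something taken from Logdy itself: the Message type is a minimal stand-in for the one the editor provides, flatten is a hypothetical rewrite of flattenObject that takes a key prefix (here app) so the flattened keys cannot clash with journald fields, and array elements get their zero-based index as a key segment.

```typescript
// Minimal stand-in for Logdy's Message type (assumption: only the field used here).
type Message = { json_content?: Record<string, unknown> };

type Flat = Record<string, unknown>;

// Hypothetical variant of flattenObject: every leaf becomes one entry whose key
// is the dot-joined path to it. Arrays are walked too, using zero-based indexes.
// Empty objects and arrays produce no entries.
function flatten(value: unknown, prefix: string, out: Flat = {}): Flat {
    if (typeof value === "object" && value !== null) {
        for (const [key, child] of Object.entries(value)) {
            flatten(child, `${prefix}.${key}`, out);
        }
    } else {
        out[prefix] = value;
    }
    return out;
}

const middleware = (line: Message): Message | void => {
    let parsed: unknown;
    try {
        parsed = JSON.parse(String(line.json_content?.["MESSAGE"]));
    } catch {
        return line; // MESSAGE is missing or not JSON: keep the line as is
    }
    line.json_content = {
        ...line.json_content,      // keep the journald fields
        ...flatten(parsed, "app"), // add one top-level field per leaf value
    };
    return line;
};
```

With this, the first example event from earlier in the thread would yield the keys app.A.B, app.A.C.x and app.A.C.y, and the second one app.D.0, app.D.1 and app.D.2.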

Then pick the columns to display in the settings tab. Note, however, that the columns in the UI are always fixed; you can use additional logic in the middleware to assign specific values to specific columns. I tested this approach on demo.logdy.dev and it worked, see the screenshot:

image
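
One possible reading of that "additional logic", given that the set of columns is fixed while the flattened keys vary from event to event: give a few known keys their own columns and pack everything else into a single catch-all column. The sketch below is only an illustration of that idea. FIXED_COLUMNS, toFixedColumns and the extra column name are all hypothetical, and the input is assumed to be the output of flattenObject.

```typescript
// Keys that get their own column (assumption: whatever your service always logs).
const FIXED_COLUMNS = ["msg", "level", "time"];

// Hypothetical helper: splits flattened fields into fixed columns plus one catch-all.
function toFixedColumns(flat: Record<string, unknown>): Record<string, unknown> {
    const columns: Record<string, unknown> = {};
    const extra: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(flat)) {
        if (FIXED_COLUMNS.includes(key)) columns[key] = value;
        else extra[key] = value;
    }
    // Serialize the leftovers so they fit in a single column.
    columns["extra"] = JSON.stringify(extra);
    return columns;
}
```

In the middleware, the result of toFixedColumns(flattenObject(parsed)) could then be spread into line.json_content in place of the raw flattened object, so that only msg, level, time and extra need to be picked in the settings tab.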