metrico / qryn

⭐️ All-in-One Polyglot Observability with OLAP Storage for Logs, Metrics, Traces & Profiles. Drop-in Grafana Cloud replacement compatible with Loki, Prometheus, Tempo, Pyroscope, Opentelemetry, Datadog and beyond :rocket:
https://qryn.dev
GNU Affero General Public License v3.0

Parse json and filter by fields value on clickhouse side #250

Closed R-omk closed 1 year ago

R-omk commented 2 years ago

Apply JSONExtractString to samples.string in the same way as it is applied to time_series.labels:

{namespace="ns1"} | json | field1FromJson="val1" | field2FromJson!~"err"

akvlad commented 2 years ago

@R-omk if you want to use JSONExtractString, please use json with parameters: {namespace="ns1"} | json f1="field1FromJson", f2="field2FromJson" | f1="val1" | f2!~"err"

The parameterless json operator works in a completely different way and cannot be emulated in ClickHouse.
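To make the distinction concrete, here is a minimal Python sketch of the two behaviours. It is not qryn's code: the function names `extract_paths` and `flatten_all` are hypothetical, and the underscore-joined flattening is an assumption modelled on how the parameterless `json` stage is usually described for LogQL. The parameterized form reads a fixed set of paths known in advance (which maps naturally onto per-path JSONExtractString calls), while the parameterless form has to discover every key in every line.

```python
import json


def extract_paths(line, mapping):
    """Parameterized form, e.g. `json f1="field1FromJson"`.

    Only the requested dotted paths are read. Returns None for invalid JSON
    (the thread notes that erroneous lines are omitted).
    """
    try:
        doc = json.loads(line)
    except ValueError:
        return None
    out = {}
    for label, path in mapping.items():
        node = doc
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is not None:
            # Non-string values are re-serialized so every label is a string.
            out[label] = node if isinstance(node, str) else json.dumps(node)
    return out


def flatten_all(line, sep="_"):
    """Parameterless form: every leaf becomes a label (assumed behaviour).

    Nested keys are joined with `sep`; the set of labels is unknown until
    the line has been parsed.
    """
    try:
        doc = json.loads(line)
    except ValueError:
        return None
    if not isinstance(doc, dict):
        return None
    out = {}

    def walk(prefix, node):
        for key, value in node.items():
            name = f"{prefix}{sep}{key}" if prefix else key
            if isinstance(value, dict):
                walk(name, value)
            elif isinstance(value, str):
                out[name] = value
            else:
                # Arrays and scalars are stringified here for simplicity.
                out[name] = json.dumps(value)

    walk("", doc)
    return out
```

With a line such as `{"field1FromJson": "val1", "message": {"request": "POST /x"}}`, `extract_paths` only touches the paths it was given, whereas `flatten_all` produces one label per leaf.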

akvlad commented 2 years ago

Well, it can be emulated by a UDF, but UDFs seem never to leave experimental status.

R-omk commented 2 years ago

It really worked, but I think I found a bug.

I tried something like this, but the result contains lines that do not contain POST:

| json my_field="message.request" | my_field=~"POST" | line_format "{{.my_field}}"

However, exact match works correctly.

Part of the raw SQL:

(isValidJSON(samples.string) = 1) AND ((arrayExists(x -> (((x.1) = 'my_field') AND (extractAllGroups(x.2, '(POST)') != [])), extra_labels) = 0) OR ((arrayExists(x -> ((x.1) = 'my_field'), extra_labels) = 0) AND ((arrayExists(x -> ((x.1) = 'my_field'), labels) = 1) AND (match(arrayFirst(x -> ((x.1) = 'my_field'), labels).2, 'POST') != 0))))
akvlad commented 2 years ago

@R-omk :D I think I switched =~ and !~. Please check !~ "POST".
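For reference, the intended semantics of the two operators can be sketched in a few lines of Python. This is an illustration only, not qryn's implementation: `regex_filter` is a hypothetical helper, and it assumes unanchored matching, in line with the match() call in the generated SQL above.

```python
import re


def regex_filter(value, op, pattern):
    """Return True if the line should be kept for `label op "pattern"`."""
    found = re.search(pattern, value) is not None  # unanchored search (assumption)
    if op == "=~":
        return found        # keep lines where the pattern matches
    if op == "!~":
        return not found    # keep lines where the pattern does not match
    raise ValueError(f"unsupported operator: {op}")
```

Swapping the two branches reproduces the reported symptom: `my_field=~"POST"` returns exactly the lines that do not contain POST.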

R-omk commented 2 years ago

yep, that's exactly what it is )

akvlad commented 2 years ago

Thanks for the bug report. I will fix it tomorrow.

R-omk commented 2 years ago

I also saw a crash today while working with json; you can try it yourself.

Tomorrow I'll try to create a separate ticket if you don't do it earlier.

Context: | json xxx="message" | line_format "{{.xxx}}" | json | request=~`ttt1`

Here message contains a string that is valid JSON with a field request. Environment: LINE_FMT=go_native

--

I also observe problems with the standard Grafana pattern "Log query with parsing of nested json":

{} |= `` | json | __error__=`` | line_format `{{.message}}` | json | __error__=``

It looks like the field __error__ doesn't exist.

akvlad commented 2 years ago

__error__ doesn't exist. Erroneous lines are omitted. Empty braces {} are not supported either.

akvlad commented 2 years ago

The fix is tested and merged. @R-omk Please check.

akvlad commented 1 year ago

Issue is closed due to no new questions.