metrico / qryn

Polyglot Observability Stack. Lightweight & Drop-in compatible with Loki, Prometheus, Tempo, Pyroscope, Opentelemetry, Datadog & more! WASM powered ⭐️ Star to Support
https://qryn.dev
GNU Affero General Public License v3.0

[Feature Request] "Native" parsing of logfmt #448

Open akvlad opened 5 months ago

akvlad commented 5 months ago

Current state

Currently logfmt is parsed inside the JS application. If the |logfmt pipe is encountered in a query, the raw log lines are downloaded from the ClickHouse server and all further processing is done on the JS side.

It is incredibly slow and inefficient.

It also limits the ability to support more complicated queries, such as the one in https://github.com/metrico/qryn/issues/444 : topk(10, count(count_over_time({kind="exception", app="$app"} | logfmt [$__range])) by (value))
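To make the cost concrete, the per-row work that the JS side has to do looks roughly like the sketch below. This is not qryn's actual parser: `parseLogfmt` is a hypothetical name, and the regular expression covers only the common `key=value` and `key="quoted value"` forms. The point is that every raw line has to cross the network before any of this runs.

```javascript
// Hypothetical, simplified logfmt parser (illustration only, not qryn's code).
// Handles key=value and key="quoted value" pairs; other logfmt forms are ignored.
function parseLogfmt(line) {
  const labels = {};
  // key, then '=', then either a double-quoted value or a run of non-space chars
  const pair = /([^\s=]+)=(?:"((?:[^"\\]|\\.)*)"|(\S*))/g;
  let m;
  while ((m = pair.exec(line)) !== null) {
    // m[2] is set for quoted values (unescape \" and \\), m[3] for bare values
    labels[m[1]] = m[2] !== undefined ? m[2].replace(/\\(.)/g, '$1') : m[3];
  }
  return labels;
}

// Usage: each downloaded row is parsed one by one in the application.
const parsed = parseLogfmt('level=error msg="connection reset" value=42');
```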

On the other hand, newer ClickHouse versions provide a function, extractKeyValuePairs, that makes "native" logfmt parsing possible: https://clickhouse.com/docs/en/sql-reference/functions/tuple-map-functions#extractkeyvaluepairs
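As a rough sketch of how the JS side could hand the parsing over to ClickHouse, the helpers below build the SQL expression instead of parsing rows themselves. The helper names and the `string` column name are assumptions for illustration. The only ClickHouse call used is `extractKeyValuePairs(data, key_value_delimiter)`, which returns a map of string keys to string values; its default key/value delimiter is `:`, so `=` has to be passed explicitly for logfmt. Its handling of logfmt edge cases may differ from the legacy JS parser.

```javascript
// Hypothetical helper: builds a ClickHouse expression that parses a logfmt
// line into a map. `column` is an assumed column name holding the raw line.
function logfmtMapExpr(column = 'string') {
  // Only plain identifiers are accepted, so nothing unexpected ends up in the SQL.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(column)) {
    throw new Error(`unsafe column name: ${column}`);
  }
  // logfmt separates keys and values with '='
  return `extractKeyValuePairs(${column}, '=')`;
}

// Hypothetical helper: expression that reads one extracted label from the map.
function logfmtLabelExpr(column, label) {
  // Escape backslashes and single quotes for a ClickHouse string literal
  const escaped = label.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
  return `${logfmtMapExpr(column)}['${escaped}']`;
}

// Usage: grouping "by (value)" could then happen inside ClickHouse.
const groupKey = logfmtLabelExpr('string', 'value');
```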

Requirements

  1. Extend the initial ClickHouse capability checks to detect whether the function is supported
  2. If it is supported, then use the function to parse the logfmt pipe "natively"
  3. If it is not supported, then use the legacy JS processor.
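The three requirements above could be sketched as follows. This is a minimal illustration, not the project's implementation: `runQuery` stands for any async function that sends SQL to ClickHouse and rejects when the server returns an error, and the function names and the `'native'` / `'legacy-js'` strategy labels are made up for the example.

```javascript
// Hypothetical capability probe: runs a trivial query that uses the function.
// `runQuery` is an assumed async function (sql) => result that rejects on error.
async function supportsNativeLogfmt(runQuery) {
  try {
    await runQuery("SELECT extractKeyValuePairs('a=1 b=2', '=')");
    return true;
  } catch (err) {
    // Any failure (unknown function, but also e.g. a connection error)
    // is treated as "not supported", so the legacy path stays available.
    return false;
  }
}

// Picks the processing strategy once, e.g. during startup checks.
async function chooseLogfmtStrategy(runQuery) {
  return (await supportsNativeLogfmt(runQuery)) ? 'native' : 'legacy-js';
}
```

Running the probe once at startup and caching the result would avoid paying for the extra query on every request.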

Useful Links

https://altinity.com/blog/boosting-performance-and-flexibility-of-clickhouse-key-value-pair-extraction