metrico / qryn

⭐️ All-in-One Polyglot Observability with OLAP Storage for Logs, Metrics, Traces & Profiles. Drop-in Grafana Cloud replacement compatible with Loki, Prometheus, Tempo, Pyroscope, OpenTelemetry, Datadog and beyond :rocket:
https://qryn.dev
GNU Affero General Public License v3.0

log volume with and without json parser #356

Closed arnitolog closed 1 year ago

arnitolog commented 1 year ago

Hello,

I ran into a weird issue: the same qryn instance with the same query returns a different log volume value depending on whether or not I specify the json parser. Here is a screenshot:

image

My current deployment: ClickHouse (3 shards with 2 replicas) -> chproxy (1 instance) -> qryn (1 instance). I tried removing chproxy, but got the same result: a different log volume value.

In the qryn logs I see the following. For the query with the json parser:

WITH sel_a AS (select `samples`.`string` as `string`,`samples`.`fingerprint` as `fingerprint`,samples.timestamp_ns as `timestamp_ns`,JSONExtractKeysAndValues(time_series.labels, 'String') as `labels` from cloki.samples_v3_dist as `samples` left any join `cloki`.`time_series` AS time_series on `samples`.`fingerprint` = time_series.fingerprint where (`samples`.`timestamp_ns`   between 1697222530351000000 and 1697308930351000000) and (samples.fingerprint IN (select `sel_1`.`fingerprint` from (select `fingerprint` from `cloki`.`time_series_gin` where ((`key` = 'Product') and (`val` = 'Platform'))) as `sel_1`  inner any  join (select `fingerprint` from `cloki`.`time_series_gin` where ((`key` = 'service') and (`val` = 'platform-users-service'))) as `sel_2` on `sel_1`.`fingerprint` = `sel_2`.`fingerprint`)) order by `timestamp_ns` desc limit 1000) select * from sel_a order by `labels` desc,`timestamp_ns` desc

for query without json parser:

WITH sel_a AS (select `samples`.`string` as `string`,`samples`.`fingerprint` as `fingerprint`,samples.timestamp_ns as `timestamp_ns` from cloki.samples_v3_dist as `samples` where (`samples`.`timestamp_ns`   between 1697222530351000000 and 1697308930351000000) and (samples.fingerprint IN (select `sel_1`.`fingerprint` from (select `fingerprint` from `cloki`.`time_series_gin` where ((`key` = 'Product') and (`val` = 'Platform'))) as `sel_1`  inner any  join (select `fingerprint` from `cloki`.`time_series_gin` where ((`key` = 'service') and (`val` = 'platform-users-service'))) as `sel_2` on `sel_1`.`fingerprint` = `sel_2`.`fingerprint`)) order by `timestamp_ns` desc limit 1000) select JSONExtractKeysAndValues(time_series.labels, 'String') as `labels`,sel_a.* from sel_a left any join `cloki`.`time_series_dist` AS time_series on `sel_a`.`fingerprint` = time_series.fingerprint order by `labels` desc,`timestamp_ns` desc
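
One thing that stands out when comparing the two logged queries is that they do not reference the same set of `cloki` tables. Below is a minimal Python sketch that extracts the table names from abbreviated fragments of each query and prints the difference. The helper name `cloki_tables` and the regex are illustrative assumptions, and the thread does not confirm whether this difference is what caused the mismatch.

```python
import re

# Matches bare or backtick-quoted references of the form cloki.<table>.
TABLE_RE = re.compile(r"`?cloki`?\.`?(\w+)`?")

def cloki_tables(sql):
    """Return the set of table names referenced under the cloki database."""
    return set(TABLE_RE.findall(sql))

# Abbreviated fragments copied from the two logged queries above
# ("..." marks parts left out here; they contain no other cloki tables).
with_json = (
    "from cloki.samples_v3_dist as `samples` "
    "left any join `cloki`.`time_series` AS time_series "
    "... from `cloki`.`time_series_gin` where ..."
)
without_json = (
    "from cloki.samples_v3_dist as `samples` "
    "... from `cloki`.`time_series_gin` where ... "
    "left any join `cloki`.`time_series_dist` AS time_series"
)

# Tables that appear in only one of the two queries.
only_with = cloki_tables(with_json) - cloki_tables(without_json)
only_without = cloki_tables(without_json) - cloki_tables(with_json)

print("only in json-parser query:", sorted(only_with))
print("only in plain query:", sorted(only_without))
```

With the json parser the label join goes to `time_series`, while without it the join goes to `time_series_dist`; everything else the two queries read from is the same.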
akvlad commented 1 year ago

@arnitolog v2.4.3 should fix the case. Please check.

arnitolog commented 1 year ago

@akvlad thanks a lot. 2.4.3 works perfectly!