tsearle opened this issue 1 month ago
Hello @tsearle. Let's run a couple of helper queries to understand where the bottleneck is:
```sql
select count(), uniq(fingerprint) from (
    SELECT fingerprint
    FROM time_series_gin AS time_series_gin
    WHERE ((key = 'name') AND (val = 'kamailio_app_server_dead'))
      AND (date >= toDate(fromUnixTimestamp(1727294240)))
      AND (date <= toDate(fromUnixTimestamp(1727294540)))
      AND (type IN (0, 0))
    GROUP BY fingerprint
    HAVING groupBitOr(bitShiftLeft(CAST((key = 'name') AND (val = 'kamailio_app_server_dead'), 'UInt64'), 0)) = 1
)
```

```sql
WITH idx AS
(
    SELECT fingerprint
    FROM time_series_gin AS time_series_gin
    WHERE ((key = 'name') AND (val = 'kamailio_app_server_dead'))
      AND (date >= toDate(fromUnixTimestamp(1727294240)))
      AND (date <= toDate(fromUnixTimestamp(1727294540)))
      AND (type IN (0, 0))
    GROUP BY fingerprint
    HAVING groupBitOr(bitShiftLeft(CAST((key = 'name') AND (val = 'kamailio_app_server_dead'), 'UInt64'), 0)) = 1
),
raw AS
(
    SELECT
        argMaxMerge(last) AS value,
        fingerprint,
        intDiv(timestamp_ns, 15000000000) * 15000 AS timestamp_ms
    FROM metrics_15s AS metrics_15s
    WHERE (fingerprint IN (idx))
      AND (timestamp_ns >= 1727294240000000000)
      AND (timestamp_ns <= 1727294540000000000)
      AND (type IN (0, 0))
    GROUP BY fingerprint, timestamp_ms
    ORDER BY fingerprint ASC, timestamp_ms ASC
)
select count() from raw;
```
Please share the results of both of them. They should give us a set of numbers about the overall cardinality you have.
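As an aside, the bucketing arithmetic in the `raw` CTE is worth spelling out: `intDiv(timestamp_ns, 15000000000) * 15000` floors each nanosecond timestamp to its 15-second boundary, expressed in milliseconds. A minimal Python sketch of the same arithmetic:

```python
def bucket_15s_ms(timestamp_ns: int) -> int:
    """Floor a nanosecond timestamp to its 15-second bucket, in milliseconds.

    Mirrors ClickHouse's intDiv(timestamp_ns, 15000000000) * 15000:
    15e9 ns = 15 s, and the quotient is rescaled to ms (15 s = 15000 ms).
    """
    return (timestamp_ns // 15_000_000_000) * 15_000

# The query's lower bound, converted from seconds to nanoseconds, lands
# in the bucket that starts at the previous 15 s boundary.
ts_ns = 1727294240 * 1_000_000_000
print(bucket_15s_ms(ts_ns))  # -> 1727294235000
```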
First query: count() 0, uniq(fingerprint) 0
Second query: count() 0
Are the queries correct?
@tsearle please try running `USE qryn;` before the queries.
To my knowledge this was already with `USE qryn` (the qryn db is selected). Explicitly adding `USE qryn` doesn't change the result.
@tsearle in this case the request you have posted should use 0 MB of RAM and finish almost instantly. There's no way it is the cause.
Can you please recheck the full request from the issue description manually?
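For context on why the posted request should be so cheap: the two Unix bounds span only a 5-minute window on a single date partition. A quick stdlib check (nothing qryn-specific):

```python
from datetime import datetime, timezone

start, end = 1727294240, 1727294540
print(end - start)  # -> 300 (seconds, i.e. a 5-minute window)
# Both bounds fall on the same date partition:
print(datetime.fromtimestamp(start, tz=timezone.utc).date())  # -> 2024-09-25
```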
@tsearle I have just understood what's wrong: it's GitHub markdown. Please try these queries instead.
```sql
select count(), uniq(fingerprint) from (
    SELECT fingerprint
    FROM time_series_gin AS time_series_gin
    WHERE ((key = '__name__') AND (val = 'kamailio_app_server_dead'))
      AND (date >= toDate(fromUnixTimestamp(1727294240)))
      AND (date <= toDate(fromUnixTimestamp(1727294540)))
      AND (type IN (0, 0))
    GROUP BY fingerprint
    HAVING groupBitOr(bitShiftLeft(CAST((key = '__name__') AND (val = 'kamailio_app_server_dead'), 'UInt64'), 0)) = 1
)
```

```sql
WITH idx AS
(
    SELECT fingerprint
    FROM time_series_gin AS time_series_gin
    WHERE ((key = '__name__') AND (val = 'kamailio_app_server_dead'))
      AND (date >= toDate(fromUnixTimestamp(1727294240)))
      AND (date <= toDate(fromUnixTimestamp(1727294540)))
      AND (type IN (0, 0))
    GROUP BY fingerprint
    HAVING groupBitOr(bitShiftLeft(CAST((key = '__name__') AND (val = 'kamailio_app_server_dead'), 'UInt64'), 0)) = 1
),
raw AS
(
    SELECT
        argMaxMerge(last) AS value,
        fingerprint,
        intDiv(timestamp_ns, 15000000000) * 15000 AS timestamp_ms
    FROM metrics_15s AS metrics_15s
    WHERE (fingerprint IN (idx))
      AND (timestamp_ns >= 1727294240000000000)
      AND (timestamp_ns <= 1727294540000000000)
      AND (type IN (0, 0))
    GROUP BY fingerprint, timestamp_ms
    ORDER BY fingerprint ASC, timestamp_ms ASC
)
select count() from raw;
```
Kind request for the future: use triple backticks when you paste a request so it's rendered as code. The GitHub markdown processor eats parts of the text, like these double underscores.
Sorry for the inconvenience.
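To illustrate what happened: GitHub markdown treats a pair of double underscores as strong emphasis, so `__name__` in unfenced SQL renders as bold "name" with the underscores stripped. A rough regex approximation of that rule (not GitHub's actual parser):

```python
import re

def eat_strong_emphasis(text: str) -> str:
    """Crude stand-in for markdown's __strong__ rule: drops each paired
    set of double underscores, keeping only the text between them."""
    return re.sub(r"__(.+?)__", r"\1", text)

sql = "WHERE ((key = '__name__') AND (val = 'kamailio_app_server_dead'))"
print(eat_strong_emphasis(sql))
# -> WHERE ((key = 'name') AND (val = 'kamailio_app_server_dead'))
```

Inside a fenced code block no emphasis processing happens, which is why the corrected paste above survives intact.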
@akvlad attached results ![Uploading Screenshot 2024-09-30 at 11.07.25.png…]()
@gabrielgontariu @tsearle the screenshot didn't work unfortunately
Qryn threw the following error:

```
Code: 241. DB::Exception: Memory limit (total) exceeded: would use 7.21 GiB (attempt to allocate chunk of 4476253 bytes), maximum: 7.20 GiB. OvercommitTracker decision: Query was selected to stop by OvercommitTracker.: (while reading column labels): (while reading from part data/3bebf7bf-6abc-4a92-ae16-834c90cbafbc/20240925_0_24268_13230/ in table qryn.time_series (3bebf7bf-6abc-4a92-ae16-834c90cbafbc) located on disk s3WithKeeperDisk of type s3, from mark 0 with max_rows_to_read = 8192): While executing MergeTreeSelect(pool: PrefetchedReadPool, algorithm: Thread). (MEMORY_LIMIT_EXCEEDED) (version 24.5.1.22957 (official build))
```
When digging in the ClickHouse console, the query that failed appears to be:

```sql
WITH idx AS (
    SELECT fingerprint
    FROM qryn.time_series_gin AS time_series_gin
    WHERE ((key = '__name__') AND (val = 'kamailio_app_server_dead'))
      AND (date >= toDate(fromUnixTimestamp(1727294240)))
      AND (date <= toDate(fromUnixTimestamp(1727294540)))
      AND (type IN (0, 0))
    GROUP BY fingerprint
    HAVING groupBitOr(bitShiftLeft(((key = '__name__') AND (val = 'kamailio_app_server_dead'))::UInt64, 0)) = 1
),
raw AS (
    SELECT
        argMaxMerge(last) AS value,
        fingerprint,
        intDiv(timestamp_ns, 15000000000) * 15000 AS timestamp_ms
    FROM metrics_15s AS metrics_15s
    WHERE (fingerprint IN (idx))
      AND (timestamp_ns >= 1727294240000000000)
      AND (timestamp_ns <= 1727294540000000000)
      AND (type IN (0, 0))
    GROUP BY fingerprint, timestamp_ms
    ORDER BY fingerprint, timestamp_ms
),
timeSeries AS (
    SELECT
        fingerprint,
        arraySort(JSONExtractKeysAndValues(labels, 'String')) AS labels
    FROM qryn.time_series
    WHERE (fingerprint IN (idx)) AND (type IN (0, 0))
)
SELECT
    any(labels) AS stream,
    arraySort(groupArray((raw.timestamp_ms, raw.value))) AS values
FROM raw AS raw
ANY LEFT JOIN timeSeries AS time_series ON time_series.fingerprint = raw.fingerprint
GROUP BY raw.fingerprint
ORDER BY raw.fingerprint
FORMAT RowBinary
```

This caused the prometheus query (for alarm monitoring) to fail: `kamailio_app_server_dead{}`
Am I doing something stupidly wrong?