Trino version is 406.
I am running a performance test with JMeter on one simple query, against a single cluster with 18 workers. Below is my query:
SELECT "td" AS "td",
date_trunc('day', CAST(due_day_local AS TIMESTAMP)) AS "due_day_local",
count(DISTINCT "so_id") AS "COUNT_DISTINCT(so_id)" FROM xxxx
WHERE "due_day" >= from_iso8601_date('2023-05-23')
AND "due_day" <= from_iso8601_date('2023-05-30')
AND "td_date" IS null
AND "td_location" = 'xxxxe'
GROUP BY "td", date_trunc('day', CAST(due_day_local AS TIMESTAMP))
ORDER BY "COUNT_DISTINCT(so_id)" DESC
LIMIT 10000
Below is my config:
coordinator=true
node-scheduler.include-coordinator=false
node-scheduler.max-splits-per-node=200
node-scheduler.max-pending-splits-per-task=20
query.max-stage-count=400
query.max-length=65432
query.stage-count-warning-threshold=400
query.max-memory=120GB
query.max-memory-per-node=10GB
exchange.http-client.request-timeout=120s
exchange.client-threads=40
query.max-run-time=400s
scheduler.http-client.max-requests-queued-per-destination=4096
query.max-history=200
query.min-expire-age=30m
http-server.log.max-size=67108864B
http-server.log.max-history=5
The resource group for this user is:
{
"name": "xxxx",
"softMemoryLimit": "95%",
"hardConcurrencyLimit":50,
"schedulingWeight": 2000,
"maxQueued": 220
},
Why are only about 4 queries running while many others wait in the queue? I found that Analysis Time and Planning Time are around 2-3 seconds.
How can I improve performance here? Is there anything wrong with my config? Is it possible to reduce the Analysis Time and Planning Time? By the way, I am querying Hudi tables through the Hudi connector.
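For reference, this is a rough sketch of how the running/queued counts and the timings could be checked on the coordinator while the JMeter test runs. It assumes the system.runtime.queries table in this version exposes the queued_time_ms, analysis_time_ms and planning_time_ms columns; I believe it does, but treat the column names as an assumption and check with DESCRIBE first.

```sql
-- Sketch only: the column names are assumed, verify with
--   DESCRIBE system.runtime.queries;
-- Groups the queries the coordinator currently tracks by state and
-- averages the time spent queued, in analysis, and in planning.
SELECT
    state,
    count(*)              AS queries,
    avg(queued_time_ms)   AS avg_queued_ms,
    avg(analysis_time_ms) AS avg_analysis_ms,
    avg(planning_time_ms) AS avg_planning_ms
FROM system.runtime.queries
GROUP BY state
ORDER BY queries DESC;
```

If the RUNNING count stays near 4 while the QUEUED count keeps growing, that would match what I see in the UI.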