trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.49k stars 3.02k forks source link

Severe performance degradation going from Trino 419 to 464 #24136

Open vburenin opened 1 week ago

vburenin commented 1 week ago

Today we have attempted migration from Trino 419 to Trino 464 and we started seeing a few queries failing with memory limit issues:

The query in question is this:

SELECT "foo" FROM
   (WITH om_stores AS
      (SELECT s.some_id,
              COALESCE(BOOL_OR(ent.feature_name = 'test' AND ent.disabled_at IS NULL), False) AS om
       FROM "iceberg"."dbname"."tblname" s
       LEFT JOIN iceberg.dbname2.data_history ent ON s.some_id = ent.some_id
       WHERE s.deleted_at IS NULL GROUP BY s.some_id)
           SELECT CAST(o.day_partition AS DATE) AS analysis_date,
                                   o.server_region,
                                   o.foo_slot AS foo,
                                   o.foo_slot
    FROM om_stores om
    LEFT JOIN iceberg.dbname3.foo_orders o ON om.some_id = o.some_id
    WHERE DATE(o.day_partition) >= current_date - INTERVAL '60' DAY
    GROUP BY o.day_partition, o.server_region, o.foo_slot) AS "virtual_table"
 WHERE "analysis_date" >= DATE '2024-10-14' AND "analysis_date" < DATE '2024-11-14'
 GROUP BY "foo", date_trunc('week', CAST("analysis_date" AS TIMESTAMP))
 LIMIT 1000;

It fails like this:

Query 20241114_213039_05361_smnbd, FAILED, 7 nodes
Splits: 20,928 total, 12,657 done (60.48%)
15.95 [531M rows, 10.7GB] [33.3M rows/s, 686MB/s]

Query 20241114_213039_05361_smnbd failed: Query exceeded distributed user memory limit of 40GB

Changing WHERE DATE(o.day_partition) >= current_date - INTERVAL '60' DAY to WHERE o.day_partition >= '2024-09-15' Immediately returns everything back to normal.

Query 20241114_213936_05525_smnbd, FINISHED, 7 nodes
Splits: 2,189 total, 2,189 done (100.00%)
5.34 [127M rows, 1.52GB] [23.7M rows/s, 292MB/s]

While this query itself is obviously not great, but this is what users came up with and we can't just break their stuff.

Explain shows a lack of missing filter for foo_orders table in Trino 464 while this example is copied from 419: (CAST("day_partition_6" AS date) >= DATE '2024-09-15') AND (CAST("day_partition_6" AS DATE) >= DATE '2024-10-14') AND (CAST("day_partition_6" AS DATE) < DATE '2024-11-14')

All tables are Iceberg tables.

martint commented 1 week ago

See https://github.com/trinodb/trino/pull/22987

lordicecream commented 4 days ago

@vburenin did adding the session property help with the queries? I am also facing severe degradation when upgrading my trino setup from 418 to 464

vburenin commented 4 days ago

@lordicecream yes it did. Did you hit the same problem?

vburenin commented 3 days ago

@lordicecream This config was introduced later, it is not applicable for 422