databendlabs / databend

๐——๐—ฎ๐˜๐—ฎ, ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜๐—ถ๐—ฐ๐˜€ & ๐—”๐—œ. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com
Other
7.87k stars 751 forks source link

Feature: support spilling large materialized CTEs to temporary tables #16858

Open maxjustus opened 5 days ago

maxjustus commented 5 days ago

Currently materialized CTEs reside fully in memory with no disk spilling capability. This means that if the CTE youโ€™re materializing is larger than your configured query memory limit you will get a memory limit exceeded error.

The only current option to work around this while retaining the performance benefits of materialized CTEs is to pull the CTE into a create transient table โ€ฆ which is created before you run the query.

It would be ideal to have a setting that specifies a memory ratio threshold above which materialized CTEs are converted into temporary tables.

It could be called something like materialized_cte_spilling_memory_ratio to be consistent with the existing aggregate_spilling_memory_ratio and join_spilling_memory_ratio settings.