Enhanced Data Configuration
The configuration of the `SpillerConfig` component has been updated: it can now use the Parquet format, which stores and retrieves spilled data more efficiently.
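As a rough, hypothetical sketch (the actual Databend definitions differ), a spiller configuration carrying a selectable spill file format could look like this; `SpillFileFormat` and the field names here are assumptions for illustration:

```rust
/// Hypothetical sketch only; not the actual Databend `SpillerConfig`.
#[derive(Clone, Debug)]
pub enum SpillFileFormat {
    Parquet,
    ArrowIpc,
}

#[derive(Clone, Debug)]
pub struct SpillerConfig {
    /// Local directory or object-store prefix that receives spill files.
    pub spill_prefix: String,
    /// Format used when encoding spilled data blocks.
    pub file_format: SpillFileFormat,
}

fn main() {
    let config = SpillerConfig {
        spill_prefix: "_spill/window_partition".to_string(),
        file_format: SpillFileFormat::Parquet,
    };
    println!("{config:?}");
}
```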
Revamped Data Spilling Method
The way our software handles spilling (overflowing data from main memory to backup storage) has been improved. Spilling now processes multiple blocks of data at once (vectorized spill) instead of one block at a time, which should speed up operations on large amounts of data.
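To make the idea concrete, here is a minimal, self-contained sketch of a vectorized spill, assuming already-encoded blocks (`Vec<u8>`) and an in-memory stand-in for the spill file; none of these names come from the PR:

```rust
use std::io::{Cursor, Write};

/// Minimal sketch of a vectorized spill (assumed design, not Databend's
/// implementation): many encoded blocks go into ONE spill file in a single
/// pass, plus an offset index so individual blocks can be restored later.
fn spill_blocks(blocks: &[Vec<u8>]) -> std::io::Result<(Vec<u8>, Vec<(u64, u64)>)> {
    let mut file = Cursor::new(Vec::new()); // stand-in for a real spill file
    let mut index = Vec::with_capacity(blocks.len());
    for block in blocks {
        let offset = file.position();
        file.write_all(block)?;
        index.push((offset, block.len() as u64));
    }
    Ok((file.into_inner(), index))
}

/// Restore one block using its recorded (offset, length) entry.
fn restore_block(spill: &[u8], entry: (u64, u64)) -> Vec<u8> {
    let (offset, len) = (entry.0 as usize, entry.1 as usize);
    spill[offset..offset + len].to_vec()
}

fn main() -> std::io::Result<()> {
    let blocks = vec![b"block-0".to_vec(), b"block-1".to_vec(), b"block-2".to_vec()];
    let (spill, index) = spill_blocks(&blocks)?;
    assert_eq!(restore_block(&spill, index[1]), b"block-1".to_vec());
    Ok(())
}
```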
Added Serialization Capabilities
We've added a new `serialize.rs` module that converts data blocks to the Parquet format and back. Parquet handles large volumes of data efficiently, making spill operations faster and less resource-intensive.
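The actual serialize.rs is not reproduced here; as a rough illustration of the idea, this self-contained sketch uses the upstream `arrow`, `parquet`, and `bytes` crates to round-trip a small record batch through in-memory Parquet bytes:

```rust
// Illustrative round trip through Parquet using the upstream crates
// (not Databend's serialize.rs).
//
// [dependencies]
// arrow = "53"
// parquet = "53"
// bytes = "1"
use std::sync::Arc;

use arrow::array::{ArrayRef, Int64Array, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use bytes::Bytes;
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;
use parquet::arrow::ArrowWriter;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // A small "data block" represented as an Arrow RecordBatch.
    let schema = Arc::new(Schema::new(vec![
        Field::new("l_orderkey", DataType::Int64, false),
        Field::new("l_shipmode", DataType::Utf8, false),
    ]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![
            Arc::new(Int64Array::from(vec![1, 2, 3])) as ArrayRef,
            Arc::new(StringArray::from(vec!["AIR", "MAIL", "AIR"])) as ArrayRef,
        ],
    )?;

    // Encode: write the batch into an in-memory Parquet buffer.
    let mut buf = Vec::new();
    let mut writer = ArrowWriter::try_new(&mut buf, schema, None)?;
    writer.write(&batch)?;
    writer.close()?;

    // Decode: read the batch back from the Parquet bytes.
    let reader = ParquetRecordBatchReaderBuilder::try_new(Bytes::from(buf))?.build()?;
    for restored in reader {
        assert_eq!(restored?.num_rows(), 3);
    }
    Ok(())
}
```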
Improved Data Management
We have made significant improvements to the `WindowPartitionBuffer`, which manages overflowed (spilled) data partitions. This should lead to better management and organization of that data, helping operations run more smoothly.
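Databend's actual WindowPartitionBuffer is not shown here; the following is a minimal sketch, under assumed names and a toy memory-accounting scheme, of the general pattern: buffer blocks per partition and spill the largest partition once a memory budget is exceeded.

```rust
use std::collections::HashMap;

/// Hypothetical partition buffer (assumed design, not Databend's
/// WindowPartitionBuffer): blocks are grouped per partition in memory and
/// the largest partition is spilled once a memory budget is hit.
struct PartitionBuffer {
    partitions: HashMap<u64, Vec<Vec<u8>>>,
    in_memory_bytes: usize,
    memory_limit: usize,
    spilled: Vec<(u64, Vec<Vec<u8>>)>, // stand-in for spill files on disk
}

impl PartitionBuffer {
    fn new(memory_limit: usize) -> Self {
        Self {
            partitions: HashMap::new(),
            in_memory_bytes: 0,
            memory_limit,
            spilled: Vec::new(),
        }
    }

    fn add_block(&mut self, partition_id: u64, block: Vec<u8>) {
        self.in_memory_bytes += block.len();
        self.partitions.entry(partition_id).or_default().push(block);
        while self.in_memory_bytes > self.memory_limit {
            self.spill_largest_partition();
        }
    }

    /// Spill the partition holding the most bytes; a vectorized spill would
    /// hand all of its blocks to the spiller in one call.
    fn spill_largest_partition(&mut self) {
        let largest = self
            .partitions
            .iter()
            .max_by_key(|(_, blocks)| blocks.iter().map(|b| b.len()).sum::<usize>())
            .map(|(id, _)| *id);
        if let Some(id) = largest {
            let blocks = self.partitions.remove(&id).unwrap_or_default();
            self.in_memory_bytes -= blocks.iter().map(|b| b.len()).sum::<usize>();
            self.spilled.push((id, blocks));
        } else {
            // Nothing left to spill; stop even if over budget.
            self.in_memory_bytes = 0;
        }
    }
}

fn main() {
    let mut buffer = PartitionBuffer::new(16);
    for i in 0..8u64 {
        buffer.add_block(i % 4, vec![0u8; 4]);
    }
    assert!(!buffer.spilled.is_empty());
}
```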
Flexible Spilling Option
A new setting (`spilling_use_parquet`) has been introduced that allows choosing between the Parquet and Arrow IPC formats for data spilling. This flexibility means the format that best suits the current workload can be selected, optimizing performance and resource use.
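Continuing the hypothetical sketch from above (the setting name is from the PR, but the types and dispatch here are assumptions, not the real settings plumbing), the flag would simply decide which encoder the spiller calls:

```rust
/// Hypothetical dispatch on the spill format; illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SpillFileFormat {
    Parquet,
    ArrowIpc,
}

/// Map the session setting onto a format.
fn resolve_spill_format(spilling_use_parquet: bool) -> SpillFileFormat {
    if spilling_use_parquet {
        SpillFileFormat::Parquet
    } else {
        SpillFileFormat::ArrowIpc
    }
}

/// In the real code these branches would call the Parquet or Arrow IPC
/// serializers; here the payload is just tagged for illustration.
fn encode_block(format: SpillFileFormat, rows: &[(i64, &str)]) -> Vec<u8> {
    match format {
        SpillFileFormat::Parquet => format!("parquet:{rows:?}").into_bytes(),
        SpillFileFormat::ArrowIpc => format!("ipc:{rows:?}").into_bytes(),
    }
}

fn main() {
    let format = resolve_spill_format(true);
    assert_eq!(format, SpillFileFormat::Parquet);
    let _bytes = encode_block(format, &[(1, "AIR")]);
}
```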
Optimized Data Operations
By refining the data-handling functions, data spill operations are now more efficient and more clearly defined. This makes the code easier to maintain and could speed up future development.
Benchmark:
dataset: tpch sf100
settings:
```sql
set max_memory_usage = 16*1024*1024*1024;
set window_partition_spilling_memory_ratio = 30;
set window_partition_spilling_to_disk_bytes_limit = 30*1024*1024*1024;
```
```sql
EXPLAIN ANALYZE SELECT
    l_orderkey,
    l_partkey,
    l_quantity,
    l_extendedprice,
    l_shipinstruct,
    l_shipmode,
    ROW_NUMBER() OVER (PARTITION BY l_orderkey ORDER BY l_extendedprice DESC) AS row_num,
    RANK() OVER (PARTITION BY l_orderkey ORDER BY l_extendedprice DESC) AS rank_num
FROM
    lineitem ignore_result;
```
`set spilling_use_parquet = 0;`
```
├── estimated rows: 600037902.00
├── cpu time: 651.285131424s
├── wait time: 168.630024827s
├── output rows: 600.04 million
├── output bytes: 44.87 GiB
├── numbers local spilled by write: 208
├── bytes local spilled by write: 15.06 GiB
├── local spilled time by write: 136.856s
├── numbers local spilled by read: 3072
├── bytes local spilled by read: 15.06 GiB
├── local spilled time by read: 31.933s
```
`set spilling_use_parquet = 1;`
```
├── estimated rows: 600037902.00
├── cpu time: 848.406496078s
├── wait time: 73.858260885s
├── output rows: 600.04 million
├── output bytes: 44.87 GiB
├── numbers local spilled by write: 208
├── bytes local spilled by write: 9.56 GiB
├── local spilled time by write: 55.665s
├── numbers local spilled by read: 3072
├── bytes local spilled by read: 9.56 GiB
├── local spilled time by read: 17.512s
```
Compared with Arrow IPC, Parquet's smaller spill files mainly come from dictionary encoding; at the same time, Parquet's CPU usage is considerably higher. For highly discrete (high-cardinality) data there is no significant advantage.
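For context, dictionary encoding is controlled through the `parquet` crate's writer properties (it is on by default); whether this PR tunes these options is not stated here, so treat the following as background rather than the PR's code:

```rust
// Background sketch: toggling dictionary encoding via the `parquet` crate's
// WriterProperties. Dictionary encoding is what shrinks low-cardinality
// columns; for highly discrete columns it mostly adds CPU cost.
//
// [dependencies]
// parquet = "53"
use parquet::file::properties::WriterProperties;

fn main() {
    // Dictionary encoding on (the crate's default): repeated values such as
    // l_shipmode are stored once in a per-column dictionary.
    let _with_dict = WriterProperties::builder()
        .set_dictionary_enabled(true)
        .build();

    // Dictionary encoding off: useful when columns are highly discrete and
    // the dictionary would not pay for its encoding overhead.
    let _without_dict = WriterProperties::builder()
        .set_dictionary_enabled(false)
        .build();
}
```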
pr-16612-3f8af35-1729002801
note: this image tag is only available for internal use, please check the internal doc for more details.
LGTM, need rebase.
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Support using the `parquet` format when spilling; you can switch to Arrow IPC via `set spilling_file_format = 'arrow'`.
Tests
Type of change