Cloudberry Database version
PostgreSQL 14.4 (Cloudberry Database 1.0.0+344efe8 build dev) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3), 64-bit compiled on Jul 2 2024 11:37:20
and
PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18), 64-bit compiled on Jan 19 2024 06:49:03 Bhuvnesh C.
What happened
Greenplum 7.1 is faster than Cloudberry. Here are the EXPLAIN ANALYZE outputs:
### GREENPLUM
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 4:1  (slice1; segments: 4)  (cost=0.00..32216.60 rows=2068960 width=16) (actual time=242900.739..242900.753 rows=62 loops=1)
   Merge Key: (count(eft_incident_base_data.op_id))
   ->  Sort  (cost=0.00..32105.87 rows=517240 width=16) (actual time=53786.687..53786.704 rows=22 loops=1)
         Sort Key: (count(eft_incident_base_data.op_id)) DESC
         Sort Method:  quicksort  Memory: 101kB
         Executor Memory: 103kB  Segments: 4  Max: 26kB (segment 0)
         ->  Result  (cost=0.00..31215.23 rows=517240 width=16) (actual time=37911.378..53786.662 rows=22 loops=1)
               Filter: ((count(eft_incident_base_data.op_id)) > 1000000)
               ->  HashAggregate  (cost=0.00..31172.69 rows=1293100 width=24) (actual time=168067.804..242845.712 rows=1294322 loops=1)
                     Group Key: eft_incident_base_data.op_id
                     Planned Partitions: 8
                     Extra Text: (seg0)   hash table(s): 1; 1291212 groups total in 40 batches, 1304322200 spill partitions; disk usage: 4472192KB; chain length 2.3 avg, 10 max; using 1291212 of 10747904 buckets; total 0 expansions.
                     ->  Hash Left Join  (cost=0.00..24252.08 rows=56266252 width=8) (actual time=2518.693..70212.733 rows=945424552 loops=1)
                           Hash Cond: (eft_incident_base_data.op_id = edt_e3id_action.op_id)
                           Extra Text: (seg3)   Initial batch 0:
                           (seg3)   Wrote 282618K bytes to inner workfile.
                           (seg3)   Wrote 151836K bytes to outer workfile.
                           (seg3)   Initial batches 1..31:
                           (seg3)   Read 282618K bytes from inner workfile: 9117K avg x 31 nonempty batches, 9360K max.
                           (seg3)   Read 151836K bytes from outer workfile: 4898K avg x 31 nonempty batches, 58338K max.
                           (seg3)   Work file set: 62 files (0 compressed), avg file size 7160336, compression buffer size 0 bytes
                           (seg3)   Hash chain length 17.3 avg, 1609 max, using 614923 of 4194304 buckets.
                           ->  Hash Left Join  (cost=0.00..12616.67 rows=10130971 width=8) (actual time=879.597..4333.434 rows=5654063 loops=1)
                                 Hash Cond: (eft_incident_base_data.op_id = edt_incident_feedback.op_id)
                                 Extra Text: (seg3)   Initial batch 0:
                                 (seg3)   Wrote 24512K bytes to inner workfile.
                                 (seg3)   Wrote 33652K bytes to outer workfile.
                                 (seg3)   Initial batches 1..3:
                                 (seg3)   Read 24512K bytes from inner workfile: 8171K avg x 3 nonempty batches, 8201K max.
                                 (seg3)   Read 33652K bytes from outer workfile: 11218K avg x 3 nonempty batches, 11455K max.
                                 (seg3)   Work file set: 6 files (0 compressed), avg file size 9912320, compression buffer size 0 bytes
                                 (seg3)   Hash chain length 3.3 avg, 277 max, using 356840 of 524288 buckets.
                                 ->  Hash Left Join  (cost=0.00..10222.77 rows=7080984 width=8) (actual time=688.712..3207.168 rows=1638919 loops=1)
                                       Hash Cond: (eft_incident_base_data.op_id = edt_incident_times.op_id)
                                       Extra Text: (seg3)   Initial batch 0:
                                       (seg3)   Wrote 26512K bytes to inner workfile.
                                       (seg3)   Wrote 33652K bytes to outer workfile.
                                       (seg3)   Initial batches 1..3:
                                       (seg3)   Read 26512K bytes from inner workfile: 8838K avg x 3 nonempty batches, 8847K max.
                                       (seg3)   Read 33652K bytes from outer workfile: 11218K avg x 3 nonempty batches, 11455K max.
                                       (seg3)   Work file set: 6 files (0 compressed), avg file size 10250922, compression buffer size 0 bytes
                                       (seg3)   Hash chain length 2.7 avg, 13 max, using 480135 of 524288 buckets.
                                       ->  Hash Left Join  (cost=0.00..7174.00 rows=4039919 width=8) (actual time=481.611..2322.945 rows=1638919 loops=1)
                                             Hash Cond: ((edt_incident_resource_base_data.op_id = edt_incident_resource_times.op_id) AND (edt_incident_resource_base_data.dl_id = edt_incident_resource_times.dl_id))
                                             Extra Text: (seg0)   Initial batch 0:
                                             (seg0)   Wrote 24739K bytes to inner workfile.
                                             (seg0)   Wrote 31191K bytes to outer workfile.
                                             (seg0)   Initial batches 1..3:
                                             (seg0)   Read 24739K bytes from inner workfile: 8247K avg x 3 nonempty batches, 8261K max.
                                             (seg0)   Read 31191K bytes from outer workfile: 10397K avg x 3 nonempty batches, 10412K max.
                                             (seg0)   Work file set: 6 files (0 compressed), avg file size 9535488, compression buffer size 0 bytes
                                             (seg0)   Hash chain length 2.1 avg, 11 max, using 436893 of 524288 buckets.
                                             ->  Hash Left Join  (cost=0.00..4409.71 rows=3314379 width=24) (actual time=290.203..1599.308 rows=1638919 loops=1)
                                                   Hash Cond: (eft_incident_base_data.op_id = edt_incident_resource_base_data.op_id)
                                                   Extra Text: (seg0)   Initial batch 0:
                                                   (seg0)   Wrote 24857K bytes to inner workfile.
                                                   (seg0)   Wrote 26641K bytes to outer workfile.
                                                   (seg0)   Initial batches 1..3:
                                                   (seg0)   Read 24857K bytes from inner workfile: 8286K avg x 3 nonempty batches, 8578K max.
                                                   (seg0)   Read 26641K bytes from outer workfile: 8881K avg x 3 nonempty batches, 8888K max.
                                                   (seg0)   Work file set: 6 files (0 compressed), avg file size 8776362, compression buffer size 0 bytes
                                                   (seg0)   Hash chain length 2.6 avg, 10003 max, using 364608 of 524288 buckets.
                                                   ->  Hash Left Join  (cost=0.00..2674.14 rows=2060497 width=8) (actual time=106.685..863.709 rows=1302387 loops=1)
                                                         Hash Cond: (eft_incident_base_data.op_id = edt_incident_patients.op_id)
                                                         Extra Text: (seg2)   Initial batch 0:
                                                         (seg2)   Wrote 8486K bytes to inner workfile.
                                                         (seg2)   Wrote 17679K bytes to outer workfile.
                                                         (seg2)   Initial batch 1:
                                                         (seg2)   Read 8486K bytes from inner workfile.
                                                         (seg2)   Read 17679K bytes from outer workfile.
                                                         (seg2)   Work file set: 2 files (0 compressed), avg file size 13385728, compression buffer size 0 bytes
                                                         (seg2)   Hash chain length 2.6 avg, 79 max, using 236867 of 262144 buckets.
                                                         ->  Hash Left Join  (cost=0.00..1683.40 rows=1293100 width=8) (actual time=0.585..249.938 rows=1294365 loops=1)
                                                               Hash Cond: (eft_incident_base_data.op_id = edt_incident_modules.op_id)
                                                               Extra Text: (seg3)   Hash chain length 1.2 avg, 6 max, using 221 of 65536 buckets.
                                                               ->  Seq Scan on eft_incident_base_data  (cost=0.00..965.83 rows=1293100 width=8) (actual time=0.248..105.290 rows=1294322 loops=1)
                                                               ->  Hash  (cost=431.01..431.01 rows=265 width=8) (actual time=0.139..0.140 rows=280 loops=1)
                                                                     Buckets: 65536  Batches: 1  Memory Usage: 523kB
                                                                     ->  Seq Scan on edt_incident_modules  (cost=0.00..431.01 rows=265 width=8) (actual time=0.095..0.114 rows=280 loops=1)
                                                         ->  Hash  (cost=518.03..518.03 rows=620533 width=8) (actual time=105.424..105.424 rows=621323 loops=1)
                                                               Buckets: 131072  Batches: 2  Memory Usage: 13180kB
                                                               ->  Seq Scan on edt_incident_patients  (cost=0.00..518.03 rows=620533 width=8) (actual time=0.193..46.634 rows=621323 loops=1)
                                                   ->  Hash  (cost=629.79..629.79 rows=931556 width=16) (actual time=183.947..183.948 rows=937929 loops=1)
                                                         Buckets: 131072  Batches: 4  Memory Usage: 11949kB
                                                         ->  Seq Scan on edt_incident_resource_base_data  (cost=0.00..629.79 rows=931556 width=16) (actual time=0.263..91.094 rows=937929 loops=1)
                                             ->  Hash  (cost=898.27..898.27 rows=931556 width=16) (actual time=192.700..192.700 rows=937929 loops=1)
                                                   Buckets: 131072  Batches: 4  Memory Usage: 12006kB
                                                   ->  Seq Scan on edt_incident_resource_times  (cost=0.00..898.27 rows=931556 width=16) (actual time=0.480..92.090 rows=937929 loops=1)
                                       ->  Hash  (cost=1745.31..1745.31 rows=1293100 width=8) (actual time=206.985..206.986 rows=1294322 loops=1)
                                             Buckets: 131072  Batches: 4  Memory Usage: 13710kB
                                             ->  Seq Scan on edt_incident_times  (cost=0.00..1745.31 rows=1293100 width=8) (actual time=0.661..95.720 rows=1294322 loops=1)
                                 ->  Hash  (cost=491.92..491.92 rows=1191101 width=8) (actual time=190.770..190.771 rows=1195129 loops=1)
                                       Buckets: 131072  Batches: 4  Memory Usage: 12692kB
                                       ->  Seq Scan on edt_incident_feedback  (cost=0.00..491.92 rows=1191101 width=8) (actual time=0.169..89.582 rows=1195129 loops=1)
                           ->  Hash  (cost=1884.51..1884.51 rows=10656262 width=8) (actual time=1638.647..1638.648 rows=10666092 loops=1)
                                 Buckets: 131072  Batches: 32  Memory Usage: 14056kB
                                 ->  Seq Scan on edt_e3id_action  (cost=0.00..1884.51 rows=10656262 width=8) (actual time=0.226..745.003 rows=10666092 loops=1)
 Optimizer: GPORCA
 Planning Time: 155.110 ms
   (slice0)    Executor memory: 6297K bytes.
 * (slice1)    Executor memory: 88801K bytes avg x 4 workers, 88858K bytes max (seg1).  Work_mem: 22801K bytes max, 417669K bytes wanted.
 Memory used:  131072kB
 Memory wanted:  3760713kB
 Execution Time: 244109.319 ms
(105 rows)
Time: 244266.772 ms (04:04.267)
ebi_dwh=# show optimizer_enable_groupagg;
optimizer_enable_groupagg
---------------------------
off
(1 row)
Time: 0.184 ms
ebi_dwh=# select version();
                                                                      version
------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18), 64-bit compiled on Jan 19 2024 06:49:03 Bhuvnesh C.
(1 row)
### CLOUDBERRY
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 4:1  (slice1; segments: 4)  (cost=0.00..33933.28 rows=2068960 width=16) (actual time=278327.276..278327.293 rows=62 loops=1)
   Merge Key: (count(eft_incident_base_data.op_id))
   ->  Sort  (cost=0.00..33822.54 rows=517240 width=16) (actual time=65899.950..65899.969 rows=22 loops=1)
         Sort Key: (count(eft_incident_base_data.op_id)) DESC
         Sort Method:  quicksort  Memory: 101kB
         ->  Result  (cost=0.00..32931.90 rows=517240 width=16) (actual time=45214.428..65899.902 rows=22 loops=1)
               Filter: ((count(eft_incident_base_data.op_id)) > 1000000)
               ->  HashAggregate  (cost=0.00..32889.36 rows=1293100 width=24) (actual time=186033.702..278274.606 rows=1294322 loops=1)
                     Group Key: eft_incident_base_data.op_id
                     Planned Partitions: 8
                     ->  Hash Left Join  (cost=0.00..26482.75 rows=52076798 width=8) (actual time=3088.688..77238.238 rows=945424552 loops=1)
                           Hash Cond: (eft_incident_base_data.op_id = edt_e3id_action.op_id)
                           Extra Text: (seg3)   Initial batch 0:
                           (seg3)   Wrote 282080K bytes to inner workfile.
                           (seg3)   Wrote 151456K bytes to outer workfile.
                           (seg3)   Initial batches 1..31:
                           (seg3)   Read 282618K bytes from inner workfile: 9117K avg x 31 nonempty batches, 9360K max.
                           (seg3)   Read 151836K bytes from outer workfile: 4898K avg x 31 nonempty batches, 58338K max.
                           (seg3)   Hash chain length 17.3 avg, 1609 max, using 614923 of 4194304 buckets.
                           ->  Hash Left Join  (cost=0.00..13333.15 rows=14034085 width=8) (actual time=564.574..8352.059 rows=5654063 loops=1)
                                 Hash Cond: (eft_incident_base_data.op_id = edt_incident_times.op_id)
                                 Extra Text: (seg3)   Initial batch 0:
                                 (seg3)   Wrote 26464K bytes to inner workfile.
                                 (seg3)   Wrote 127136K bytes to outer workfile.
                                 (seg3)   Initial batches 1..3:
                                 (seg3)   Read 26512K bytes from inner workfile: 8838K avg x 3 nonempty batches, 8847K max.
                                 (seg3)   Read 127202K bytes from outer workfile: 42401K avg x 3 nonempty batches, 79218K max.
                                 (seg3)   Hash chain length 2.7 avg, 13 max, using 480135 of 524288 buckets.
                                 ->  Hash Left Join  (cost=0.00..9293.84 rows=8494151 width=8) (actual time=329.747..6405.692 rows=5654063 loops=1)
                                       Hash Cond: (eft_incident_base_data.op_id = edt_incident_feedback.op_id)
                                       Extra Text: (seg3)   Initial batch 0:
                                       (seg3)   Wrote 24480K bytes to inner workfile.
                                       (seg3)   Wrote 33600K bytes to outer workfile.
                                       (seg3)   Initial batches 1..3:
                                       (seg3)   Read 24512K bytes from inner workfile: 8171K avg x 3 nonempty batches, 8201K max.
                                       (seg3)   Read 33652K bytes from outer workfile: 11218K avg x 3 nonempty batches, 11455K max.
                                       (seg3)   Hash chain length 3.3 avg, 277 max, using 356840 of 524288 buckets.
                                       ->  Hash Left Join  (cost=0.00..7312.96 rows=5029541 width=8) (actual time=113.453..5140.761 rows=1638919 loops=1)
                                             Hash Cond: (eft_incident_base_data.op_id = edt_incident_patients.op_id)
                                             Extra Text: (seg2)   Initial batch 0:
                                             (seg2)   Wrote 8480K bytes to inner workfile.
                                             (seg2)   Wrote 21792K bytes to outer workfile.
                                             (seg2)   Initial batch 1:
                                             (seg2)   Read 8486K bytes from inner workfile.
                                             (seg2)   Read 21805K bytes from outer workfile.
                                             (seg2)   Hash chain length 2.6 avg, 79 max, using 236867 of 262144 buckets.
                                             ->  Redistribute Motion 4:4  (slice2; segments: 4)  (cost=0.00..5901.13 rows=3346957 width=8) (actual time=0.027..4842.748 rows=1604777 loops=1)
                                                   Hash Key: eft_incident_base_data.op_id
                                                   ->  Hash Left Join  (cost=0.00..5817.32 rows=3346957 width=8) (actual time=333.481..1608.450 rows=3607553 loops=1)
                                                         Hash Cond: ((edt_incident_resource_base_data.op_id = edt_incident_resource_times.op_id) AND (edt_incident_resource_base_data.dl_id = edt_incident_resource_times.dl_id))
                                                         Extra Text: (seg0)   Initial batch 0:
                                                         (seg0)   Wrote 24704K bytes to inner workfile.
                                                         (seg0)   Wrote 30208K bytes to outer workfile.
                                                         (seg0)   Initial batches 1..3:
                                                         (seg0)   Read 24739K bytes from inner workfile: 8247K avg x 3 nonempty batches, 8261K max.
                                                         (seg0)   Read 30240K bytes from outer workfile: 10080K avg x 3 nonempty batches, 10099K max.
                                                         (seg0)   Hash chain length 2.1 avg, 11 max, using 436893 of 524288 buckets.
                                                         ->  Redistribute Motion 4:4  (slice3; segments: 4)  (cost=0.00..3389.09 rows=2440184 width=24) (actual time=0.532..449.382 rows=3607553 loops=1)
                                                               Hash Key: edt_incident_resource_base_data.op_id
                                                               ->  Hash Left Join  (cost=0.00..3205.78 rows=2440184 width=24) (actual time=334.661..1304.563 rows=1604777 loops=1)
                                                                     Hash Cond: (eft_incident_base_data.op_id = edt_incident_resource_base_data.op_id)
                                                                     Extra Text: (seg0)   Initial batch 0:
                                                                     (seg0)   Wrote 24832K bytes to inner workfile.
                                                                     (seg0)   Wrote 26400K bytes to outer workfile.
                                                                     (seg0)   Initial batches 1..3:
                                                                     (seg0)   Read 24857K bytes from inner workfile: 8286K avg x 3 nonempty batches, 8578K max.
                                                                     (seg0)   Read 26476K bytes from outer workfile: 8826K avg x 3 nonempty batches, 8831K max.
                                                                     (seg0)   Hash chain length 2.6 avg, 10003 max, using 364608 of 524288 buckets.
                                                                     ->  Hash Left Join  (cost=0.00..1681.27 rows=1293100 width=8) (actual time=0.752..291.635 rows=1294365 loops=1)
                                                                           Hash Cond: (eft_incident_base_data.op_id = edt_incident_modules.op_id)
                                                                           Extra Text: (seg3)   Hash chain length 1.2 avg, 6 max, using 221 of 65536 buckets.
                                                                           ->  Seq Scan on eft_incident_base_data  (cost=0.00..963.69 rows=1293100 width=8) (actual time=0.265..114.597 rows=1294322 loops=1)
                                                                           ->  Hash  (cost=431.01..431.01 rows=265 width=8) (actual time=0.287..0.290 rows=280 loops=1)
                                                                                 Buckets: 65536  Batches: 1  Memory Usage: 523kB
                                                                                 ->  Seq Scan on edt_incident_modules  (cost=0.00..431.01 rows=265 width=8) (actual time=0.216..0.253 rows=280 loops=1)
                                                                     ->  Hash  (cost=629.28..629.28 rows=931556 width=16) (actual time=333.320..333.320 rows=937929 loops=1)
                                                                           Buckets: 131072  Batches: 4  Memory Usage: 11949kB
                                                                           ->  Seq Scan on edt_incident_resource_base_data  (cost=0.00..629.28 rows=931556 width=16) (actual time=0.703..211.313 rows=937929 loops=1)
                                                         ->  Hash  (cost=898.27..898.27 rows=931556 width=16) (actual time=332.451..332.454 rows=937929 loops=1)
                                                               Buckets: 131072  Batches: 4  Memory Usage: 12006kB
                                                               ->  Seq Scan on edt_incident_resource_times  (cost=0.00..898.27 rows=931556 width=16) (actual time=0.827..187.077 rows=937929 loops=1)
                                             ->  Hash  (cost=489.02..489.02 rows=620533 width=8) (actual time=113.713..113.715 rows=621323 loops=1)
                                                   Buckets: 131072  Batches: 2  Memory Usage: 13180kB
                                                   ->  Seq Scan on edt_incident_patients  (cost=0.00..489.02 rows=620533 width=8) (actual time=0.188..51.142 rows=621323 loops=1)
                                       ->  Hash  (cost=491.27..491.27 rows=1191101 width=8) (actual time=216.181..216.182 rows=1195129 loops=1)
                                             Buckets: 131072  Batches: 4  Memory Usage: 12692kB
                                             ->  Seq Scan on edt_incident_feedback  (cost=0.00..491.27 rows=1191101 width=8) (actual time=0.167..98.239 rows=1195129 loops=1)
                                 ->  Hash  (cost=1745.31..1745.31 rows=1293100 width=8) (actual time=234.717..234.718 rows=1294322 loops=1)
                                       Buckets: 131072  Batches: 4  Memory Usage: 13710kB
                                       ->  Seq Scan on edt_incident_times  (cost=0.00..1745.31 rows=1293100 width=8) (actual time=0.429..105.290 rows=1294322 loops=1)
                           ->  Hash  (cost=1884.51..1884.51 rows=10656262 width=8) (actual time=2523.935..2523.936 rows=10666092 loops=1)
                                 Buckets: 131072  Batches: 32  Memory Usage: 14056kB
                                 ->  Seq Scan on edt_e3id_action  (cost=0.00..1884.51 rows=10656262 width=8) (actual time=0.204..1476.526 rows=10666092 loops=1)
 Planning Time: 174.027 ms
   (slice0)    Executor memory: 6351K bytes.
 * (slice1)    Executor memory: 64638K bytes avg x 4x(0) workers, 64710K bytes max (seg1).  Work_mem: 22801K bytes max, 417669K bytes wanted.
 * (slice2)    Executor memory: 12562K bytes avg x 4x(0) workers, 12624K bytes max (seg0).  Work_mem: 12006K bytes max, 44990K bytes wanted.
 * (slice3)    Executor memory: 13389K bytes avg x 4x(0) workers, 13429K bytes max (seg3).  Work_mem: 11949K bytes max, 44990K bytes wanted.
 Memory used:  131072kB
 Memory wanted:  3760913kB
 Optimizer: Pivotal Optimizer (GPORCA)
 Execution Time: 279589.669 ms
(102 rows)
Time: 279765.913 ms (04:39.766)
ebi_dwh=# show optimizer_enable_groupagg;
optimizer_enable_groupagg
---------------------------
off
(1 row)
Time: 0.276 ms
ebi_dwh=# select version();
                                                                      version
------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.4 (Cloudberry Database 1.0.0+344efe8 build dev) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3), 64-bit compiled on Jul 2 2024 11:37:20
(1 row)
What you think should happen instead
Both databases had the same data and up-to-date statistics, yet Greenplum is faster than Cloudberry. Please find out why and optimize Cloudberry.
How to reproduce
On a large table, run an aggregation of the form count(...) with GROUP BY, HAVING count(...) > 1000000, and ORDER BY count(...) DESC, and execute the same query with both Greenplum and Cloudberry (a sketch of the query shape follows below).
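For reference, a minimal sketch of what such a query could look like, reconstructed from the EXPLAIN ANALYZE plans above. The table names and join keys come from the plans; the exact select list and aliases are assumptions, not the original statement:

```sql
-- Hypothetical reconstruction from the plans above, not the original query:
-- a base table left-joined to several detail tables on op_id, then
-- aggregated per op_id with a HAVING filter and a descending sort.
EXPLAIN ANALYZE
SELECT b.op_id, count(b.op_id) AS cnt
FROM eft_incident_base_data b
LEFT JOIN edt_incident_modules            m  ON b.op_id = m.op_id
LEFT JOIN edt_incident_patients           p  ON b.op_id = p.op_id
LEFT JOIN edt_incident_resource_base_data rb ON b.op_id = rb.op_id
LEFT JOIN edt_incident_resource_times     rt ON rb.op_id = rt.op_id
                                           AND rb.dl_id = rt.dl_id
LEFT JOIN edt_incident_times              t  ON b.op_id = t.op_id
LEFT JOIN edt_incident_feedback           f  ON b.op_id = f.op_id
LEFT JOIN edt_e3id_action                 a  ON b.op_id = a.op_id
GROUP BY b.op_id
HAVING count(b.op_id) > 1000000
ORDER BY count(b.op_id) DESC;
```

Running the same statement under EXPLAIN ANALYZE on both systems should reproduce plans like those shown above.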
Operating System
RHEL 8 (Greenplum), RHEL 9 (Cloudberry)
Anything else
I talked to Max Yang via Slack and he told me to create this issue. There is a long discussion about it in the Cloudberry Slack.