ChenHuajun / pg_roaringbitmap

RoaringBitmap extension for PostgreSQL
Apache License 2.0
223 stars 37 forks source link

parallel aggregate may product wrong result #6

Closed ChenHuajun closed 5 years ago

ChenHuajun commented 5 years ago

parallel aggregate sometime product wrong result

env: PostgreSQL 10.10 pg_roaringbitmap 0.4.0 CentOS 7.6

test case:

set max_parallel_workers=8;
set max_parallel_workers_per_gather=2;
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
create table if not exists bitmap_test_tb1(id int, bitmap roaringbitmap);
insert into bitmap_test_tb1 values (NULL,NULL);
insert into bitmap_test_tb1 select id,rb_build(ARRAY[id]) from generate_series(1,10000)id;
insert into bitmap_test_tb1 values (NULL,NULL);
insert into bitmap_test_tb1 values (10001,rb_build(ARRAY[10,100,1000,10000,10001]));

with a as
(
select rb_or_agg(bitmap) bitmap from bitmap_test_tb1
)
select rb_cardinality(bitmap),rb_min(bitmap),rb_max(bitmap) from a;

expected result:

 rb_cardinality | rb_min | rb_max
----------------+--------+--------
           10001 |      1 |  10001
(1 row)

actually result:

 rb_cardinality | rb_min | rb_max
----------------+--------+--------
           3139 |      1 |  6907
(1 row)

or others

plan:

explain(costs off) 
with a as
(
select rb_or_agg(bitmap) bitmap from bitmap_test_tb1
)
select rb_cardinality(bitmap),rb_min(bitmap),rb_max(bitmap) from a;
                           QUERY PLAN                           
----------------------------------------------------------------
 CTE Scan on a
   CTE a
     ->  Finalize Aggregate
           ->  Gather
                 Workers Planned: 2
                 ->  Partial Aggregate
                       ->  Parallel Seq Scan on bitmap_test_tb1
(7 rows)

reason: PostgreSQL's combine_aggregates() only init fcinfo->isnull once but calls rb_deserialize() three times above. In this example, the data is divided into 3 parallel scans, one of which is empty, and the result of partial aggregation of 3 data sets is passed to combine_aggregates() function by the order of non-null->null->non-null. Since dsinfo->isnull is set to true when processing the second dataset, dsinfo->isnull remains when process the third dataset, so the third dataset is mistaken for null. The result of the calculation is incorrect.

static void
combine_aggregates(AggState *aggstate, AggStatePerGroup pergroup)
{
...
                FunctionCallInfo dsinfo = &pertrans->deserialfn_fcinfo;
...
                fcinfo->arg[1] = FunctionCallInvoke(dsinfo);
                fcinfo->argnull[1] = dsinfo->isnull;
...
}