timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

Segfault in get_aggsplit #3742

Closed. hardikm10 closed this issue 2 years ago.

hardikm10 commented 2 years ago

Relevant system information:
OS: Docker (timescale/timescaledb:2.4.2-pg13)
PostgreSQL version (output of postgres --version): 13.4
TimescaleDB version (output of \dx in psql): 2.4.2
Installation method: using Docker

Describe the bug
On a multinode setup with one access node and one data node (1AN/1DN), running the query below with enable_partitionwise_aggregate set to ON causes a segfault:
SELECT 3* COUNT(*) FROM foo;
However, if a new data node is added, the segfault doesn't occur.

To Reproduce

create table foo (a integer, b integer, c integer);

select table_name from create_distributed_hypertable('foo', 'a','b', chunk_time_interval=> 10);

insert into foo values( 3 , 16 , 20);
insert into foo values( 1 , 10 , 20);
insert into foo values( 1 , 11 , 20);
insert into foo values( 1 , 12 , 20);
insert into foo values( 1 , 13 , 20);
insert into foo values( 1 , 14 , 20);
insert into foo values( 2 , 14 , 20);
insert into foo values( 2 , 15 , 20);
insert into foo values( 2 , 16 , 20);

SHOW enable_partitionwise_aggregate ;

SET enable_partitionwise_aggregate TO OFF;

SELECT COUNT(*) FROM foo;  -- works fine

SELECT 3* COUNT(*) FROM foo; --  works fine

SET enable_partitionwise_aggregate TO ON;
SHOW enable_partitionwise_aggregate ;
SELECT COUNT(*) FROM foo;   --works fine

SELECT 3* COUNT(*) FROM foo; -- FAILS

Here's the coredump:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f2d645ec918 in get_aggsplit (rel=0x555c85685f98, rel=0x555c85685f98) at /build/timescaledb/tsl/src/fdw/estimate.c:51
51  /build/timescaledb/tsl/src/fdw/estimate.c: No such file or directory.
(gdb) bt
#0  0x00007f2d645ec918 in get_aggsplit (rel=0x555c85685f98, rel=0x555c85685f98) at /build/timescaledb/tsl/src/fdw/estimate.c:51
#1  get_upper_rel_estimate (ce=<synthetic pointer>, rel=0x555c85685f98, root=0x555c856066b8) at /build/timescaledb/tsl/src/fdw/estimate.c:101
#2  fdw_estimate_path_cost_size (root=0x555c856066b8, rel=0x555c85685f98, pathkeys=0x0, p_rows=0x7ffdc5adb360, p_width=0x7ffdc5adb34c, p_startup_cost=0x7ffdc5adb358, p_total_cost=0x7ffdc5adb350)
    at /build/timescaledb/tsl/src/fdw/estimate.c:230
#3  0x00007f2d645ed585 in add_foreign_grouping_paths (root=<optimized out>, input_rel=<optimized out>, grouped_rel=<optimized out>, extra=<optimized out>, create_path=0x7f2d645e7b20 <data_node_scan_upper_path_create>)
    at /build/timescaledb/tsl/src/fdw/scan_plan.c:628
#4  0x0000555c833f1105 in create_ordinary_grouping_paths (root=<optimized out>, input_rel=0x555c85605558, grouped_rel=<optimized out>, agg_costs=0x7ffdc5adb800, gd=0x0, extra=<optimized out>, partially_grouped_rel_p=0x7ffdc5adb618)
    at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:4171
#5  0x0000555c833f2483 in create_partitionwise_grouping_paths (extra=0x7ffdc5adb840, patype=<optimized out>, gd=0x0, agg_costs=0x7ffdc5adb800, partially_grouped_rel=<optimized out>, grouped_rel=0x555c85685840, input_rel=0x555c85606d28,
    root=0x555c856066b8) at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:7657
#6  create_ordinary_grouping_paths (root=<optimized out>, input_rel=0x555c85606d28, grouped_rel=<optimized out>, agg_costs=0x7ffdc5adb800, gd=0x0, extra=<optimized out>, partially_grouped_rel_p=0x7ffdc5adb7e8)
    at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:4123
#7  0x0000555c833ecb3d in create_grouping_paths (gd=0x0, agg_costs=0x7ffdc5adb800, target_parallel_safe=true, target=0x555c85685370, input_rel=0x555c85606d28, root=0x555c856066b8)
    at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:3896
#8  grouping_planner (root=<optimized out>, inheritance_update=false, tuple_fraction=<optimized out>) at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:2201
#9  0x0000555c833eddc8 in subquery_planner (glob=glob@entry=0x555c85492330, parse=parse@entry=0x555c85492440, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0)
    at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:1015
#10 0x0000555c833ef0c1 in standard_planner (parse=0x555c85492440, query_string=<optimized out>, cursorOptions=256, boundParams=<optimized out>) at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:405
#11 0x00007f2d8714b80d in pgss_planner (parse=0x555c85492440, query_string=0x555c854914b8 "SELECT 3* COUNT(*) FROM foo;", cursorOptions=256, boundParams=0x0) at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:991
#12 0x00007f2d64668ce4 in timescaledb_planner (parse=0x555c85492440, query_string=0x555c854914b8 "SELECT 3* COUNT(*) FROM foo;", cursor_opts=256, bound_params=0x0) at /build/timescaledb/src/planner.c:305
#13 0x0000555c834bd469 in planner (boundParams=<optimized out>, cursorOptions=<optimized out>, query_string=0x555c854914b8 "SELECT 3* COUNT(*) FROM foo;", parse=0x555c85492440)
    at optimizer/plan/./build/../src/backend/optimizer/plan/planner.c:273
#14 pg_plan_query (querytree=0x555c85492440, query_string=query_string@entry=0x555c854914b8 "SELECT 3* COUNT(*) FROM foo;", cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
    at tcop/./build/../src/backend/tcop/postgres.c:875
#15 0x0000555c834bd55a in pg_plan_queries (querytrees=0x555c854931c8, query_string=query_string@entry=0x555c854914b8 "SELECT 3* COUNT(*) FROM foo;", cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
    at tcop/./build/../src/backend/tcop/postgres.c:966
#16 0x0000555c834beeac in exec_simple_query (query_string=0x555c854914b8 "SELECT 3* COUNT(*) FROM foo;") at tcop/./build/../src/backend/tcop/postgres.c:1158
#17 0x0000555c834c1727 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at tcop/./build/../src/backend/tcop/postgres.c:4339
#18 0x0000555c8343feb8 in BackendRun (port=0x555c854e4820) at postmaster/./build/../src/backend/postmaster/postmaster.c:4526
#19 BackendStartup (port=0x555c854e4820) at postmaster/./build/../src/backend/postmaster/postmaster.c:4210
#20 ServerLoop () at postmaster/./build/../src/backend/postmaster/postmaster.c:1739
#21 0x0000555c83440db4 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster/./build/../src/backend/postmaster/postmaster.c:1412
#22 0x0000555c83142989 in main (argc=17, argv=0x555c8548a030) at main/./build/../src/backend/main/main.c:210

Expected behavior
No segfault.

Actual behavior
Segfault.
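
Possible stopgap (a sketch, not a fix): the reproduction above shows the query succeeding when enable_partitionwise_aggregate is OFF, so the setting could be scoped to a single transaction with SET LOCAL, leaving the rest of the session at its configured value. This assumes the same foo table as in the reproduction and has not been verified beyond what the report itself shows.

```sql
BEGIN;
-- SET LOCAL lasts only until the end of this transaction;
-- the previous value is restored at COMMIT or ROLLBACK.
SET LOCAL enable_partitionwise_aggregate TO off;
SELECT 3 * COUNT(*) FROM foo;
COMMIT;
```

The trade-off is that the planner will not consider partitionwise aggregation for any other statement run inside that transaction.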

akuzm commented 2 years ago

Looks like this one: https://github.com/timescale/timescaledb/issues/3672

hardikm10 commented 2 years ago

@akuzm Maybe; the stack trace looks quite similar.

svenklemm commented 2 years ago

Fixed by #3708