Multi-node TimeScaleDB: Explain verbose on access node shows pushdown happening to only one data node

karandhodi commented 2 years ago

I am trying to run a query on Multi-node TimeScaleDB. However the query speed is not satisfactory at all. On doing explain verbose, I see even though chunks for a time interval are present on data0 and data1 nodes, the query gets pushed down to only data1.

Here is the output:

explain verbose select a as a, b as b, count(g.c) as c, sum(d) as d, sum(e) as e, sum(f) AS f   from  table_12 as g  where g.serialnum = '12345' and ls >= '2021-11-12 04:00:00' and ls <= '2021-11-12 06:59:00'  group by g.serialnum, a, b  order by e desc limit 10;
                                                                                                                                                                                                     QUERY PLAN                                                                                                                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3189448.60..3189448.63 rows=10 width=105)
   Output: f.a, f.b, (count(g.c)), (sum(g.d)), (sum(g.e)), (sum(g.f)), g.serialnum
   ->  Sort  (cost=3189448.60..3195433.16 rows=2393822 width=105)
         Output: f.a, f.b, (count(g.c)), (sum(g.d)), (sum(g.e)), (sum(g.f)), g.serialnum
         Sort Key: (sum(f.total_octets)) DESC
         ->  Custom Scan (DataNodeScan)  (cost=240635.98..3137718.97 rows=2393822 width=105)
               Output: f.a, f.b, (count(g.c)), (sum(g.d)), (sum(g.e)), (sum(g.f)), g.serialnum
               Relations: Aggregate on (public.table_12 f)
               Data node: timescaledb-data-1
               **Chunks: _dist_hyper_349_448_chunk, _dist_hyper_349_429_chunk, _dist_hyper_349_420_chunk**
               Remote SQL: SELECT a, b, count(g.c), sum(d), sum(e), sum(f), serialnum FROM public.table_12 WHERE _timescaledb_internal.chunks_in(public.table_12.*, ARRAY[326, 316, 309]) AND ((ls >= '2021-11-12 04:00:00'::timestamp without time zone)) AND ((ls <= '2021-11-12 06:59:00'::timestamp without time zone)) AND ((serialnum = '12345'::text)) GROUP BY 7, 1, 2

However on running the query:

SELECT data_nodes,chunk_name,range_start,range_end FROM timescaledb_information.chunks where hypertable_name='table_12';

      data_nodes      |        chunk_name         |      range_start       |       range_end
----------------------+---------------------------+------------------------+------------------------
 {timescaledb-data-1} | _dist_hyper_349_387_chunk | 2021-11-12 01:00:00+00 | 2021-11-12 02:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_388_chunk | 2021-11-12 01:00:00+00 | 2021-11-12 02:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_391_chunk | 2021-11-12 02:00:00+00 | 2021-11-12 03:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_392_chunk | 2021-11-12 02:00:00+00 | 2021-11-12 03:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_412_chunk | 2021-11-12 03:00:00+00 | 2021-11-12 04:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_413_chunk | 2021-11-12 03:00:00+00 | 2021-11-12 04:00:00+00
 **{timescaledb-data-1} | _dist_hyper_349_420_chunk | 2021-11-12 04:00:00+00 | 2021-11-12 05:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_421_chunk | 2021-11-12 04:00:00+00 | 2021-11-12 05:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_428_chunk | 2021-11-12 05:00:00+00 | 2021-11-12 06:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_429_chunk | 2021-11-12 05:00:00+00 | 2021-11-12 06:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_448_chunk | 2021-11-12 06:00:00+00 | 2021-11-12 07:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_449_chunk | 2021-11-12 06:00:00+00 | 2021-11-12 07:00:00+00**
 {timescaledb-data-0} | _dist_hyper_349_456_chunk | 2021-11-12 07:00:00+00 | 2021-11-12 08:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_457_chunk | 2021-11-12 07:00:00+00 | 2021-11-12 08:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_463_chunk | 2021-11-12 08:00:00+00 | 2021-11-12 09:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_464_chunk | 2021-11-12 08:00:00+00 | 2021-11-12 09:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_484_chunk | 2021-11-12 09:00:00+00 | 2021-11-12 10:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_485_chunk | 2021-11-12 09:00:00+00 | 2021-11-12 10:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_491_chunk | 2021-11-12 10:00:00+00 | 2021-11-12 11:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_494_chunk | 2021-11-12 10:00:00+00 | 2021-11-12 11:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_500_chunk | 2021-11-12 11:00:00+00 | 2021-11-12 12:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_501_chunk | 2021-11-12 11:00:00+00 | 2021-11-12 12:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_520_chunk | 2021-11-12 12:00:00+00 | 2021-11-12 13:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_521_chunk | 2021-11-12 12:00:00+00 | 2021-11-12 13:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_528_chunk | 2021-11-12 13:00:00+00 | 2021-11-12 14:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_529_chunk | 2021-11-12 13:00:00+00 | 2021-11-12 14:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_536_chunk | 2021-11-12 14:00:00+00 | 2021-11-12 15:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_537_chunk | 2021-11-12 14:00:00+00 | 2021-11-12 15:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_553_chunk | 2021-11-12 15:00:00+00 | 2021-11-12 16:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_554_chunk | 2021-11-12 15:00:00+00 | 2021-11-12 16:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_564_chunk | 2021-11-12 16:00:00+00 | 2021-11-12 17:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_566_chunk | 2021-11-12 16:00:00+00 | 2021-11-12 17:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_572_chunk | 2021-11-12 17:00:00+00 | 2021-11-12 18:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_573_chunk | 2021-11-12 17:00:00+00 | 2021-11-12 18:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_592_chunk | 2021-11-12 18:00:00+00 | 2021-11-12 19:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_593_chunk | 2021-11-12 18:00:00+00 | 2021-11-12 19:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_600_chunk | 2021-11-12 19:00:00+00 | 2021-11-12 20:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_601_chunk | 2021-11-12 19:00:00+00 | 2021-11-12 20:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_608_chunk | 2021-11-12 20:00:00+00 | 2021-11-12 21:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_609_chunk | 2021-11-12 20:00:00+00 | 2021-11-12 21:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_625_chunk | 2021-11-12 21:00:00+00 | 2021-11-12 22:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_626_chunk | 2021-11-12 21:00:00+00 | 2021-11-12 22:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_636_chunk | 2021-11-12 22:00:00+00 | 2021-11-12 23:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_637_chunk | 2021-11-12 22:00:00+00 | 2021-11-12 23:00:00+00
 {timescaledb-data-1} | _dist_hyper_349_644_chunk | 2021-11-12 23:00:00+00 | 2021-11-13 00:00:00+00
 {timescaledb-data-0} | _dist_hyper_349_645_chunk | 2021-11-12 23:00:00+00 | 2021-11-13 00:00:00+00

The chunks for time range 4:00 to 6:59 all belong to both data0 and data1. How do I make sure the query gets pushed down to both data nodes to improve the query speed?

Thanks in advance

mkindahl commented 2 years ago

@karandhodi Thanks for the bug report! Did some minor editing to make it more readable. Hope that is OK.

erimatnor commented 2 years ago

@karandhodi Can you share with us how you partitioned your hypertable? I assume g.serialnum is the space partitioning dimension? In that case, the EXPLAIN looks correct because you are selecting data from only one data node using g.serialnum = '12345'.

karandhodi commented 2 years ago

Yes that is right g.serialnum is the space partitioning dimension. In that case how do I partition the table to make sure the query gets pushed down to all data nodes? I have tried many things but the query time is hovering around 30-40 sec which is pretty slow. Thank you

erimatnor commented 2 years ago

@karandhodi There's nothing strictly wrong with how you partition, especially if you would group and aggregate data over many serialnums in a single query. Then you'd see every node involved. If you want to many data nodes involved when you query for only one serialnum, you could try to partition on only time or use another space dimension.

I believe the main explanation for poor query performance here is the lack of LIMIT push-down (@pmwkaa), and parallel query execution. LIMIT push-down should always be possible if only one node is involved. That's one area of potential improvement to we will work on.

In the meantime, you can try setting timescaledb.remote_data_fetcher=rowbyrow, which would enable parallel execution on the data node if it is enabled there (it should be by default), but note that that setting might not work with some queries that include subqueries.

nikkhils commented 2 years ago

I believe the main explanation for poor query performance here is the lack of LIMIT push-down (@pmwkaa), and parallel query execution. LIMIT push-down should always be possible if only one node is involved. That's one area of potential improvement to we will work on.

I believe the other issue is that even the ORDER BY is not being pushed down to the single data node. Which then doesn't allow the LIMIT to be pushed down.

nikkhils commented 2 years ago

This will be resolved via #3807

svenklemm commented 4 months ago

multinode has been removed

timescale / timescaledb

Multi-node TimeScaleDB: Explain verbose on access node shows pushdown happening to only one data node #3804