timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Bug]: unused columns retrieved from compressed chunks through view #4195

Closed: phemmer closed this issue 2 years ago

phemmer commented 2 years ago

What type of bug is this?

Performance issue

What subsystems and features are affected?

Compression

What happened?

When using a view, data retrieval from compressed chunks returns all columns regardless of whether they are used in the query. This causes significant performance impact.

TimescaleDB version affected

2.6.0

PostgreSQL version used

14.2

What operating system did you use?

docker image timescale/timescaledb:2.6.0-pg14

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

CREATE TABLE test (time TIMESTAMP, a INT, b INT, c INT);
SELECT create_hypertable('test', 'time', chunk_time_interval => INTERVAL '1h');
ALTER TABLE test SET (timescaledb.compress);
INSERT INTO test VALUES
    ('2022-03-25T11:00:00Z',1,1,1),
    ('2022-03-25T12:00:00Z',0,0,0)
;
CREATE VIEW testview AS SELECT * FROM test;
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT a FROM testview;
Append  (cost=0.00..71.00 rows=3400 width=4) (actual time=0.004..0.018 rows=2 loops=1)
  Buffers: shared hit=2
  ->  Seq Scan on _timescaledb_internal._hyper_14_18_chunk  (cost=0.00..27.00 rows=1700 width=4) (actual time=0.004..0.004 rows=1 loops=1)
        Output: _hyper_14_18_chunk.a
        Buffers: shared hit=1
  ->  Seq Scan on _timescaledb_internal._hyper_14_19_chunk  (cost=0.00..27.00 rows=1700 width=4) (actual time=0.013..0.013 rows=1 loops=1)
        Output: _hyper_14_19_chunk.a
        Buffers: shared hit=1
Planning:
  Buffers: shared hit=96 read=1
Planning Time: 0.602 ms
Execution Time: 0.031 ms

Note the Output: line for each chunk, and how it's only outputting column a. This is the correct behavior.

Compress a chunk and try again:

SELECT compress_chunk(chunk_schema || '.' || chunk_name)
    FROM timescaledb_information.chunks
    WHERE hypertable_name='test' AND range_start='2022-03-25T11:00:00Z';
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT a FROM testview;
Append  (cost=0.03..2300.00 rows=451700 width=4) (actual time=0.009..0.014 rows=2 loops=1)
  Buffers: shared hit=2
  ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_14_18_chunk  (cost=0.03..14.50 rows=450000 width=4) (actual time=0.009..0.010 rows=1 loops=1)
        Output: _hyper_14_18_chunk.a
        Buffers: shared hit=1
        ->  Seq Scan on _timescaledb_internal.compress_hyper_15_20_chunk  (cost=0.00..14.50 rows=450 width=36) (actual time=0.006..0.007 rows=1 loops=1)
              Output: compress_hyper_15_20_chunk._ts_meta_count, compress_hyper_15_20_chunk."time", compress_hyper_15_20_chunk.a, compress_hyper_15_20_chunk.b, compress_hyper_15_20_chunk.c
              Buffers: shared hit=1
  ->  Seq Scan on _timescaledb_internal._hyper_14_19_chunk  (cost=0.00..27.00 rows=1700 width=4) (actual time=0.002..0.003 rows=1 loops=1)
        Output: _hyper_14_19_chunk.a
        Buffers: shared hit=1
Planning:
  Buffers: shared hit=158 read=1
Planning Time: 1.589 ms
Execution Time: 0.063 ms

Notice how it's now outputting all columns on the compressed chunk.
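
For contrast, a quick check (a sketch against the same toy schema; not actually run in this thread) is to issue the same query directly against the hypertable, bypassing the view; per the rest of this discussion, the underlying scan on the compressed chunk is then expected to stay narrow:

-- Direct query, no view: the extra columns are reported to appear
-- only on the view path, so this plan should not list all columns
-- under the compressed chunk.
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT a FROM test;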

phemmer commented 2 years ago

Another issue I noticed is that the cost estimate doesn't change when additional columns have to be decompressed.
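
A minimal way to observe this (a sketch against the repro schema above; exact numbers will differ) is to compare estimates as the column list grows; under the reported behavior the DecompressChunk cost does not move:

-- If decompression work were costed per column, these two estimates
-- should differ; per this report they come out the same.
EXPLAIN (COSTS) SELECT a FROM testview;
EXPLAIN (COSTS) SELECT a, b, c FROM testview;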

fabriziomello commented 2 years ago

@phemmer thanks for reaching out.

Seems we're not pushing down the projection ... will check it more deeply on Monday.

phemmer commented 2 years ago

Any update on this? I was just checking against master and noticed that it's now worse, with even more columns being decompressed than originally reported:

akuzm commented 2 years ago

@phemmer How do you know they are being decompressed? IIRC the columns we are decompressing are in the Output of the DecompressChunk node. It has only the column 'a', which looks correct to me. The Output of the underlying Seq Scan doesn't matter, because that node doesn't actually do the decompression.
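
Reading the repro plan with that in mind (a sketch; schema from the toy repro above), the line to compare across queries is the DecompressChunk node's own Output, not the inner Seq Scan's:

-- Per this reading, the set of decompressed columns is the one listed
-- on the DecompressChunk node, which should grow here from {a} to
-- {a, b, c}, while the inner scan's wide Output is not itself
-- evidence of extra decompression.
EXPLAIN (VERBOSE) SELECT a FROM testview;
EXPLAIN (VERBOSE) SELECT a, b, c FROM testview;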

akuzm commented 2 years ago

I think having the underlying Seq Scan do a projection should worsen the performance. There is no way to read a single column from a standard Postgres table; we have to read the entire row. There is also detoasting, but as far as I know that is done lazily. When the underlying Seq Scan has a targetlist that differs from the physical targetlist, the node has to do extra projection work that is not actually needed. I'm trying to fix this here (still a draft): https://github.com/timescale/timescaledb/pull/4234

phemmer commented 2 years ago

> @phemmer How do you know they are being decompressed? IIRC the columns we are decompressing are in the Output of the DecompressChunk node. It has only the column 'a', which looks correct to me. The Output of the underlying Seq Scan doesn't matter, because that node doesn't actually do the decompression.

I guess I don't know, but it's the only explanation that made sense, though I could be wrong. What brought me to the conclusion was that in production I have a much wider table, where each chunk is several GB. Running the same query with and without a view shows a huge performance difference: without the view it completes in a few hundred ms; with the view it takes about 10 seconds. When I run EXPLAIN, the timing difference is all inside the DecompressChunk, with the only behavioral difference shown being the extra columns. Since the slow query shows the columns and the fast one does not, I concluded that those columns were being decompressed.

akuzm commented 2 years ago

> Without the view it completes in a few hundred ms. With the view it takes about 10 seconds.

I tried a bigger table I have at hand, and it's the same whether or not I use a view. Maybe there is some other difference... Can you post the EXPLAIN for no view + compression? Is it different from the one using the view?

akuzm commented 2 years ago

Why it uses this targetlist for the underlying Seq Scan is a mystery to me. It shouldn't change when using a simple view like that. But this targetlist also shouldn't make things slower according to what I know. We have to investigate this.

phemmer commented 2 years ago

Here's an example of an unsimplified query I'm using in production (it's not the worst-performing one, but the difference is still clearly discernible):

Without view (fast):

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
WITH tags AS NOT MATERIALIZED (
    SELECT * FROM telegraf.haproxy_backend_tag
    WHERE
      host LIKE ANY (array['%'])
      AND instance = 'myinstance'
      AND proxy LIKE ANY (array[/* ... */])
)
SELECT
    time_bucket('60.0s',"time") AS "time",
    host||' '||proxy,
    last(active_servers, time)
FROM telegraf.haproxy_backend t
INNER JOIN tags USING (tag_id)
WHERE
    "time" BETWEEN '2022-04-19T10:35:50.812Z' AND '2022-04-19T17:35:50.812Z'
GROUP BY 1, host, proxy
ORDER BY 1
;
Sort  (cost=65710.58..65893.58 rows=73200 width=74) (actual time=351.414..368.044 rows=183600 loops=1)
  Output: (time_bucket('00:01:00'::interval, t_1."time")), (((haproxy_backend_tag.host || ' '::text) || haproxy_backend_tag.proxy)), (last(t_1.active_servers, t_1."time")), haproxy_backend_tag.host, haproxy_backend_tag.proxy
  Sort Key: (time_bucket('00:01:00'::interval, t_1."time"))
  Sort Method: external merge  Disk: 15096kB
  Buffers: shared hit=27032, temp read=3586 written=7077
  ->  HashAggregate  (cost=47993.82..59796.18 rows=73200 width=74) (actual time=165.401..260.659 rows=183600 loops=1)
        Output: (time_bucket('00:01:00'::interval, t_1."time")), ((haproxy_backend_tag.host || ' '::text) || haproxy_backend_tag.proxy), last(t_1.active_servers, t_1."time"), haproxy_backend_tag.host, haproxy_backend_tag.proxy
        Group Key: time_bucket('00:01:00'::interval, t_1."time"), haproxy_backend_tag.host, haproxy_backend_tag.proxy
        Planned Partitions: 64  Batches: 65  Memory Usage: 1137kB  Disk Usage: 28432kB
        Buffers: shared hit=27032, temp read=1699 written=5189
        ->  Nested Loop  (cost=1002.61..20741.90 rows=529324 width=50) (actual time=0.312..97.352 rows=183600 loops=1)
              Output: time_bucket('00:01:00'::interval, t_1."time"), haproxy_backend_tag.host, haproxy_backend_tag.proxy, t_1.active_servers, t_1."time"
              Buffers: shared hit=27032
              ->  Gather  (cost=1000.00..6349.51 rows=371 width=34) (actual time=0.276..1.093 rows=1095 loops=1)
                    Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
                    Workers Planned: 1
                    Workers Launched: 1
                    Buffers: shared hit=2102
                    ->  Parallel Seq Scan on telegraf.haproxy_backend_tag  (cost=0.00..5312.41 rows=218 width=34) (actual time=0.033..15.065 rows=548 loops=2)
                          Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
                          Filter: ((haproxy_backend_tag.host ~~ ANY ('{%}'::text[])) AND (haproxy_backend_tag.instance = 'myinstance'::text) AND (haproxy_backend_tag.proxy ~~ ANY ('{admin_haproxy_com,adn-demo.haproxy.com,adn-ip-list,botdetection_css,botdetection_js,c3811-gsp_vault-acme_dcv,c3811_activate_haproxy,c3811_ui_priv,c3811_www,c3811_www_dev,c3811_www_download,demo.haproxy.net,dev.haproxy.com-anycast-dispatch,dev_haproxy_com,eleph.haproxy.com,haproxy.com,haproxy.com-anycast-dispatch,haproxy.com-priv-anycast-dispatch,http01.acme-dcv.haproxy.com,iplookup_test,kubernetes-registry.haproxy.com,per_ip_post_rate,redmine,waf_blocked,waf_blocked_report,waf_falsepositive_form,waf_report}'::text[])))
                          Rows Removed by Filter: 59712
                          Buffers: shared hit=2102
                          Worker 0:  actual time=0.039..29.481 rows=1055 loops=1
                            Buffers: shared hit=2044
              ->  Append  (cost=2.61..15.23 rows=2000 width=24) (actual time=0.012..0.067 rows=168 loops=1095)
                    Buffers: shared hit=24930
                    ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29523_chunk t_1  (cost=2.61..2.61 rows=1000 width=24) (actual time=0.012..0.028 rows=78 loops=1095)
                          Output: t_1."time", t_1.active_servers, t_1.tag_id
                          Filter: ((t_1."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_1."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_1.tag_id))
                          Rows Removed by Filter: 145
                          Buffers: shared hit=12465
                          ->  Index Scan using compress_hyper_22_29578_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29578_chunk  (cost=0.42..2.61 rows=1 width=92) (actual time=0.004..0.004 rows=1 loops=1095)
                                Output: compress_hyper_22_29578_chunk._ts_meta_count, compress_hyper_22_29578_chunk."time", compress_hyper_22_29578_chunk.tag_id, compress_hyper_22_29578_chunk.active_servers
                                Index Cond: (compress_hyper_22_29578_chunk.tag_id = haproxy_backend_tag.tag_id)
                                Filter: ((compress_hyper_22_29578_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29578_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
                                Buffers: shared hit=4305
                    ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29564_chunk t_2  (cost=2.61..2.61 rows=1000 width=24) (actual time=0.019..0.029 rows=89 loops=1095)
                          Output: t_2."time", t_2.active_servers, t_2.tag_id
                          Filter: ((t_2."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_2."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_2.tag_id))
                          Rows Removed by Filter: 134
                          Buffers: shared hit=12465
                          ->  Index Scan using compress_hyper_22_29616_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29616_chunk  (cost=0.42..2.61 rows=1 width=92) (actual time=0.004..0.004 rows=1 loops=1095)
                                Output: compress_hyper_22_29616_chunk._ts_meta_count, compress_hyper_22_29616_chunk."time", compress_hyper_22_29616_chunk.tag_id, compress_hyper_22_29616_chunk.active_servers
                                Index Cond: (compress_hyper_22_29616_chunk.tag_id = haproxy_backend_tag.tag_id)
                                Filter: ((compress_hyper_22_29616_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29616_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
                                Buffers: shared hit=4305
Query Identifier: -8684628961912603055
Planning:
  Buffers: shared hit=207
Planning Time: 1.285 ms
Execution Time: 383.932 ms

With view (slow):

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT
    time_bucket('60.0s',"time") AS "time",
    host||' '||proxy,
    last(active_servers, time)
FROM c3811.haproxy_backend
WHERE
    "time" BETWEEN '2022-04-19T10:35:50.812Z' AND '2022-04-19T17:35:50.812Z'
    AND host LIKE ANY (array['%'])
    AND instance = 'myinstance'
    AND proxy LIKE ANY (array[/* ... */])
GROUP BY 1, host, proxy
ORDER BY 1
;

View definition:

CREATE OR REPLACE VIEW c3811.haproxy_backend AS
WITH tags AS NOT MATERIALIZED (
    SELECT *
    FROM telegraf.haproxy_backend_tag
    WHERE instance LIKE 'haproxy-c3811-%'
)
SELECT *
FROM tags
INNER JOIN telegraf.haproxy_backend t USING (tag_id);

Sort  (cost=6831.13..6833.96 rows=1129 width=74) (actual time=1252.511..1270.936 rows=183600 loops=1)
  Output: (time_bucket('00:01:00'::interval, haproxy_backend."time")), (((haproxy_backend.host || ' '::text) || haproxy_backend.proxy)), (last(haproxy_backend.active_servers, haproxy_backend."time")), haproxy_backend.host, haproxy_backend.proxy
  Sort Key: (time_bucket('00:01:00'::interval, haproxy_backend."time"))
  Sort Method: external merge  Disk: 15096kB
  Buffers: shared hit=447432, temp read=3557 written=5052
  ->  HashAggregate  (cost=6751.31..6773.89 rows=1129 width=74) (actual time=948.598..1131.498 rows=183600 loops=1)
        Output: (time_bucket('00:01:00'::interval, haproxy_backend."time")), ((haproxy_backend.host || ' '::text) || haproxy_backend.proxy), last(haproxy_backend.active_servers, haproxy_backend."time"), haproxy_backend.host, haproxy_backend.proxy
        Group Key: time_bucket('00:01:00'::interval, haproxy_backend."time"), haproxy_backend.host, haproxy_backend.proxy
        Batches: 5  Memory Usage: 9753kB  Disk Usage: 15424kB
        Buffers: shared hit=447432, temp read=1670 written=3164
        ->  Subquery Scan on haproxy_backend  (cost=1002.64..6694.24 rows=5707 width=50) (actual time=0.828..863.932 rows=183600 loops=1)
              Output: time_bucket('00:01:00'::interval, haproxy_backend."time"), haproxy_backend.host, haproxy_backend.proxy, haproxy_backend.active_servers, haproxy_backend."time"
              Buffers: shared hit=447432
              ->  Nested Loop  (cost=1002.64..6622.90 rows=5707 width=802) (actual time=0.826..835.058 rows=183600 loops=1)
                    Output: NULL::bigint, NULL::text, haproxy_backend_tag.host, NULL::text, haproxy_backend_tag.proxy, NULL::text, NULL::text, t_1."time", t_1.active_servers, NULL::bigint, NULL::bigint, NULL::text, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::text, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::text, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::double precision, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint
                    Buffers: shared hit=447432
                    ->  Gather  (cost=1000.00..6481.78 rows=4 width=34) (actual time=0.515..1.088 rows=1095 loops=1)
                          Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
                          Workers Planned: 1
                          Workers Launched: 1
                          Buffers: shared hit=2102
                          ->  Parallel Seq Scan on telegraf.haproxy_backend_tag  (cost=0.00..5481.38 rows=2 width=34) (actual time=0.056..16.298 rows=548 loops=2)
                                Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
                                Filter: ((haproxy_backend_tag.host ~~ ANY ('{%}'::text[])) AND (haproxy_backend_tag.instance ~~ 'haproxy-c3811-%'::text) AND (haproxy_backend_tag.instance = 'myinstance'::text) AND (haproxy_backend_tag.proxy ~~ ANY ('{admin_haproxy_com,adn-demo.haproxy.com,adn-ip-list,botdetection_css,botdetection_js,c3811-gsp_vault-acme_dcv,c3811_activate_haproxy,c3811_ui_priv,c3811_www,c3811_www_dev,c3811_www_download,demo.haproxy.net,dev.haproxy.com-anycast-dispatch,dev_haproxy_com,eleph.haproxy.com,haproxy.com,haproxy.com-anycast-dispatch,haproxy.com-priv-anycast-dispatch,http01.acme-dcv.haproxy.com,iplookup_test,kubernetes-registry.haproxy.com,per_ip_post_rate,redmine,waf_blocked,waf_blocked_report,waf_falsepositive_form,waf_report}'::text[])))
                                Rows Removed by Filter: 59712
                                Buffers: shared hit=2102
                                Worker 0:  actual time=0.061..32.209 rows=1090 loops=1
                                  Buffers: shared hit=2086
                    ->  Append  (cost=2.64..15.28 rows=2000 width=24) (actual time=0.155..0.686 rows=168 loops=1095)
                          Buffers: shared hit=445330
                          ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29523_chunk t_1  (cost=2.64..2.64 rows=1000 width=24) (actual time=0.154..0.335 rows=78 loops=1095)
                                Output: t_1."time", t_1.active_servers, t_1.tag_id
                                Filter: ((t_1."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_1."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_1.tag_id))
                                Rows Removed by Filter: 145
                                Buffers: shared hit=222665
                                ->  Index Scan using compress_hyper_22_29578_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29578_chunk  (cost=0.42..2.64 rows=1 width=92) (actual time=0.005..0.005 rows=1 loops=1095)
                                      Output: compress_hyper_22_29578_chunk._ts_meta_count, compress_hyper_22_29578_chunk."time", compress_hyper_22_29578_chunk.tag_id, compress_hyper_22_29578_chunk.active_servers, compress_hyper_22_29578_chunk.agg_server_check_status, compress_hyper_22_29578_chunk.backup_servers, compress_hyper_22_29578_chunk.cookie, compress_hyper_22_29578_chunk.ctime, compress_hyper_22_29578_chunk.ctime_max, compress_hyper_22_29578_chunk.h2_backend_open_streams, compress_hyper_22_29578_chunk.h2_backend_total_streams, compress_hyper_22_29578_chunk.h2_data_rcvd, compress_hyper_22_29578_chunk.h2_detected_conn_protocol_errors, compress_hyper_22_29578_chunk.h2_detected_strm_protocol_errors, compress_hyper_22_29578_chunk.h2_goaway_rcvd, compress_hyper_22_29578_chunk.h2_goaway_resp, compress_hyper_22_29578_chunk.h2_headers_rcvd, compress_hyper_22_29578_chunk.h2_open_connections, compress_hyper_22_29578_chunk.h2_rst_stream_rcvd, compress_hyper_22_29578_chunk.h2_rst_stream_resp, compress_hyper_22_29578_chunk.h2_settings_rcvd, compress_hyper_22_29578_chunk.h2_total_connections, compress_hyper_22_29578_chunk.iid, compress_hyper_22_29578_chunk.lastchg, compress_hyper_22_29578_chunk.lastsess, compress_hyper_22_29578_chunk.mode, compress_hyper_22_29578_chunk.pid, compress_hyper_22_29578_chunk.qcur, compress_hyper_22_29578_chunk.qmax, compress_hyper_22_29578_chunk.qtime, compress_hyper_22_29578_chunk.qtime_max, compress_hyper_22_29578_chunk.rate, compress_hyper_22_29578_chunk.rate_max, compress_hyper_22_29578_chunk.reuse, compress_hyper_22_29578_chunk.rtime, compress_hyper_22_29578_chunk.rtime_max, compress_hyper_22_29578_chunk.scur, compress_hyper_22_29578_chunk.sid, compress_hyper_22_29578_chunk.slim, compress_hyper_22_29578_chunk.smax, compress_hyper_22_29578_chunk.srv_abort, compress_hyper_22_29578_chunk.ssl_failed_handshake, compress_hyper_22_29578_chunk.ssl_reused_sess, compress_hyper_22_29578_chunk.ssl_sess, compress_hyper_22_29578_chunk.status, compress_hyper_22_29578_chunk.stot, compress_hyper_22_29578_chunk.ttime, compress_hyper_22_29578_chunk.ttime_max, compress_hyper_22_29578_chunk.uweight, compress_hyper_22_29578_chunk.weight, compress_hyper_22_29578_chunk.wredis, compress_hyper_22_29578_chunk.wretr, compress_hyper_22_29578_chunk.wrew, compress_hyper_22_29578_chunk.bin, compress_hyper_22_29578_chunk.bout, compress_hyper_22_29578_chunk.cache_hits, compress_hyper_22_29578_chunk.cache_lookups, compress_hyper_22_29578_chunk.chkdown, compress_hyper_22_29578_chunk.cli_abort, compress_hyper_22_29578_chunk.connect, compress_hyper_22_29578_chunk.downtime, compress_hyper_22_29578_chunk.duration, compress_hyper_22_29578_chunk.econ, compress_hyper_22_29578_chunk.eint, compress_hyper_22_29578_chunk.eresp, compress_hyper_22_29578_chunk."http_response.1xx", compress_hyper_22_29578_chunk."http_response.2xx", compress_hyper_22_29578_chunk."http_response.3xx", compress_hyper_22_29578_chunk."http_response.4xx", compress_hyper_22_29578_chunk."http_response.5xx", compress_hyper_22_29578_chunk."http_response.other", compress_hyper_22_29578_chunk.lbtot, compress_hyper_22_29578_chunk.req_tot
                                      Index Cond: (compress_hyper_22_29578_chunk.tag_id = haproxy_backend_tag.tag_id)
                                      Filter: ((compress_hyper_22_29578_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29578_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
                                      Buffers: shared hit=4305
                          ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29564_chunk t_2  (cost=2.64..2.64 rows=1000 width=24) (actual time=0.257..0.336 rows=89 loops=1095)
                                Output: t_2."time", t_2.active_servers, t_2.tag_id
                                Filter: ((t_2."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_2."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_2.tag_id))
                                Rows Removed by Filter: 134
                                Buffers: shared hit=222665
                                ->  Index Scan using compress_hyper_22_29616_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29616_chunk  (cost=0.42..2.64 rows=1 width=92) (actual time=0.005..0.005 rows=1 loops=1095)
                                      Output: compress_hyper_22_29616_chunk._ts_meta_count, compress_hyper_22_29616_chunk."time", compress_hyper_22_29616_chunk.tag_id, compress_hyper_22_29616_chunk.active_servers, compress_hyper_22_29616_chunk.agg_server_check_status, compress_hyper_22_29616_chunk.backup_servers, compress_hyper_22_29616_chunk.cookie, compress_hyper_22_29616_chunk.ctime, compress_hyper_22_29616_chunk.ctime_max, compress_hyper_22_29616_chunk.h2_backend_open_streams, compress_hyper_22_29616_chunk.h2_backend_total_streams, compress_hyper_22_29616_chunk.h2_data_rcvd, compress_hyper_22_29616_chunk.h2_detected_conn_protocol_errors, compress_hyper_22_29616_chunk.h2_detected_strm_protocol_errors, compress_hyper_22_29616_chunk.h2_goaway_rcvd, compress_hyper_22_29616_chunk.h2_goaway_resp, compress_hyper_22_29616_chunk.h2_headers_rcvd, compress_hyper_22_29616_chunk.h2_open_connections, compress_hyper_22_29616_chunk.h2_rst_stream_rcvd, compress_hyper_22_29616_chunk.h2_rst_stream_resp, compress_hyper_22_29616_chunk.h2_settings_rcvd, compress_hyper_22_29616_chunk.h2_total_connections, compress_hyper_22_29616_chunk.iid, compress_hyper_22_29616_chunk.lastchg, compress_hyper_22_29616_chunk.lastsess, compress_hyper_22_29616_chunk.mode, compress_hyper_22_29616_chunk.pid, compress_hyper_22_29616_chunk.qcur, compress_hyper_22_29616_chunk.qmax, compress_hyper_22_29616_chunk.qtime, compress_hyper_22_29616_chunk.qtime_max, compress_hyper_22_29616_chunk.rate, compress_hyper_22_29616_chunk.rate_max, compress_hyper_22_29616_chunk.reuse, compress_hyper_22_29616_chunk.rtime, compress_hyper_22_29616_chunk.rtime_max, compress_hyper_22_29616_chunk.scur, compress_hyper_22_29616_chunk.sid, compress_hyper_22_29616_chunk.slim, compress_hyper_22_29616_chunk.smax, compress_hyper_22_29616_chunk.srv_abort, compress_hyper_22_29616_chunk.ssl_failed_handshake, compress_hyper_22_29616_chunk.ssl_reused_sess, compress_hyper_22_29616_chunk.ssl_sess, compress_hyper_22_29616_chunk.status, compress_hyper_22_29616_chunk.stot, compress_hyper_22_29616_chunk.ttime, compress_hyper_22_29616_chunk.ttime_max, compress_hyper_22_29616_chunk.uweight, compress_hyper_22_29616_chunk.weight, compress_hyper_22_29616_chunk.wredis, compress_hyper_22_29616_chunk.wretr, compress_hyper_22_29616_chunk.wrew, compress_hyper_22_29616_chunk.bin, compress_hyper_22_29616_chunk.bout, compress_hyper_22_29616_chunk.cache_hits, compress_hyper_22_29616_chunk.cache_lookups, compress_hyper_22_29616_chunk.chkdown, compress_hyper_22_29616_chunk.cli_abort, compress_hyper_22_29616_chunk.connect, compress_hyper_22_29616_chunk.downtime, compress_hyper_22_29616_chunk.duration, compress_hyper_22_29616_chunk.econ, compress_hyper_22_29616_chunk.eint, compress_hyper_22_29616_chunk.eresp, compress_hyper_22_29616_chunk."http_response.1xx", compress_hyper_22_29616_chunk."http_response.2xx", compress_hyper_22_29616_chunk."http_response.3xx", compress_hyper_22_29616_chunk."http_response.4xx", compress_hyper_22_29616_chunk."http_response.5xx", compress_hyper_22_29616_chunk."http_response.other", compress_hyper_22_29616_chunk.lbtot, compress_hyper_22_29616_chunk.req_tot
                                      Index Cond: (compress_hyper_22_29616_chunk.tag_id = haproxy_backend_tag.tag_id)
                                      Filter: ((compress_hyper_22_29616_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29616_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
                                      Buffers: shared hit=4305
Query Identifier: 6896141023051905804
Planning:
  Buffers: shared hit=208
Planning Time: 1.956 ms
Execution Time: 1378.840 ms

akuzm commented 2 years ago

OK, I think I can reproduce it and a view is indeed slower because it decompresses all the columns, even when the EXPLAIN is completely the same.

Probably the reason is here: https://github.com/timescale/timescaledb/blob/main/tsl/src/nodes/decompress_chunk/planner.c#L130. We use RangeTblEntry.selectedCols to determine which columns to decompress, but that's wrong: these are the columns that need select permissions. Because the view does SELECT *, it needs permissions for all the columns, so we think we have to decompress them all. I'll see how it can be fixed.
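
If that diagnosis holds, a possible workaround until a fix lands (a hypothetical sketch, not confirmed in this thread; testview_narrow is an illustrative name) is to avoid SELECT * in the view, so that only the referenced columns need select permission:

-- Hypothetical narrowed view: listing the columns explicitly should
-- keep RangeTblEntry.selectedCols, and hence the decompressed set,
-- limited to what the view actually exposes.
CREATE VIEW testview_narrow AS SELECT "time", a FROM test;
EXPLAIN (ANALYZE, VERBOSE) SELECT a FROM testview_narrow;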