Closed phemmer closed 2 years ago
Another issue I noticed is that the cost estimate doesn't change when additional columns have to be decompressed.
@phemmer thanks for reaching out.
Seems we're not pushing down the projection ... will check it more deeply on Monday.
Any update on this? I was just checking against master and noticed that it's worse now: there are even more columns being decompressed than originally reported:
@phemmer How do you know they are being decompressed? IIRC the columns we are decompressing are the ones in the Output of the DecompressChunk node. It has only the column 'a', which looks correct to me. The Output of the underlying Seq Scan doesn't matter, because that node doesn't actually do the decompression.
I think having the underlying Seq Scan do a projection should actually worsen performance. There is no way to read a single column from a standard Postgres table; we have to read the entire row. There is also detoasting, but as far as I know it's done lazily. When this underlying Seq Scan has a targetlist that differs from the physical targetlist, the node has to do the extra work of projection, which is not actually needed. I'm trying to fix this here (still a draft): https://github.com/timescale/timescaledb/pull/4234
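For anyone following along, one way to see which columns the DecompressChunk node actually materializes is EXPLAIN VERBOSE. A minimal sketch, assuming a hypothetical compressed hypertable `metrics(time, a, b, c)` (names not from this issue):

```sql
-- Hypothetical table for illustration.
-- The Output: line of the DecompressChunk node lists the columns the
-- node materializes; the Output: of the underlying scan on the
-- compressed chunk does not by itself imply decompression.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT a
FROM metrics
WHERE "time" > now() - interval '1 hour';
-- Look for:
--   Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_N_M_chunk
--     Output: _hyper_N_M_chunk.a          <- only 'a' should appear here
```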
@phemmer How do you know they are being decompressed? IIRC the columns we are decompressing are the ones in the Output of the DecompressChunk node. It has only the column 'a', which looks correct to me. The Output of the underlying Seq Scan doesn't matter, because that node doesn't actually do the decompression.
I guess I don't know, but it's the only explanation that made sense, though I could be wrong. What brought me to that conclusion: in production I have a table which is much wider, where each chunk is several GB. Running the same query with and without a view shows a huge performance difference. Without the view it completes in a few hundred ms; with the view it takes about 10 seconds. When I run the EXPLAIN, the timing difference is all inside the DecompressChunk, with the only behavioral difference shown being the extra columns. Since the slow query shows the columns and the fast one does not, I concluded that those columns were being decompressed.
Without the view it completes in a few hundred ms. With the view it takes about 10 seconds.
I tried a bigger table I have at hand, and it performs the same whether or not I use a view. Maybe there is some other difference... Can you post the EXPLAIN for no view + compression? Is it different from the one using a view?
Why it uses this targetlist for the underlying Seq Scan is a mystery to me. It shouldn't change when using a simple view like that. But this targetlist also shouldn't make things slower, as far as I know. We have to investigate this.
Here's an example of an unsimplified query I'm using in production (it's not the worst-performing one, but the difference is still very noticeable):
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
WITH tags AS NOT MATERIALIZED (
    SELECT * FROM telegraf.haproxy_backend_tag
    WHERE
        host LIKE ANY (array['%'])
        AND instance = 'myinstance'
        AND proxy LIKE ANY (array[/* ... */])
)
SELECT
    time_bucket('60.0s', "time") AS "time",
    host || ' ' || proxy,
    last(active_servers, time)
FROM telegraf.haproxy_backend t
INNER JOIN tags USING (tag_id)
WHERE
    "time" BETWEEN '2022-04-19T10:35:50.812Z' AND '2022-04-19T17:35:50.812Z'
GROUP BY 1, host, proxy
ORDER BY 1
;
Sort (cost=65710.58..65893.58 rows=73200 width=74) (actual time=351.414..368.044 rows=183600 loops=1)
Output: (time_bucket('00:01:00'::interval, t_1."time")), (((haproxy_backend_tag.host || ' '::text) || haproxy_backend_tag.proxy)), (last(t_1.active_servers, t_1."time")), haproxy_backend_tag.host, haproxy_backend_tag.proxy
Sort Key: (time_bucket('00:01:00'::interval, t_1."time"))
Sort Method: external merge Disk: 15096kB
Buffers: shared hit=27032, temp read=3586 written=7077
-> HashAggregate (cost=47993.82..59796.18 rows=73200 width=74) (actual time=165.401..260.659 rows=183600 loops=1)
Output: (time_bucket('00:01:00'::interval, t_1."time")), ((haproxy_backend_tag.host || ' '::text) || haproxy_backend_tag.proxy), last(t_1.active_servers, t_1."time"), haproxy_backend_tag.host, haproxy_backend_tag.proxy
Group Key: time_bucket('00:01:00'::interval, t_1."time"), haproxy_backend_tag.host, haproxy_backend_tag.proxy
Planned Partitions: 64 Batches: 65 Memory Usage: 1137kB Disk Usage: 28432kB
Buffers: shared hit=27032, temp read=1699 written=5189
-> Nested Loop (cost=1002.61..20741.90 rows=529324 width=50) (actual time=0.312..97.352 rows=183600 loops=1)
Output: time_bucket('00:01:00'::interval, t_1."time"), haproxy_backend_tag.host, haproxy_backend_tag.proxy, t_1.active_servers, t_1."time"
Buffers: shared hit=27032
-> Gather (cost=1000.00..6349.51 rows=371 width=34) (actual time=0.276..1.093 rows=1095 loops=1)
Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=2102
-> Parallel Seq Scan on telegraf.haproxy_backend_tag (cost=0.00..5312.41 rows=218 width=34) (actual time=0.033..15.065 rows=548 loops=2)
Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
Filter: ((haproxy_backend_tag.host ~~ ANY ('{%}'::text[])) AND (haproxy_backend_tag.instance = 'myinstance'::text) AND (haproxy_backend_tag.proxy ~~ ANY ('{admin_haproxy_com,adn-demo.haproxy.com,adn-ip-list,botdetection_css,botdetection_js,c3811-gsp_vault-acme_dcv,c3811_activate_haproxy,c3811_ui_priv,c3811_www,c3811_www_dev,c3811_www_download,demo.haproxy.net,dev.haproxy.com-anycast-dispatch,dev_haproxy_com,eleph.haproxy.com,haproxy.com,haproxy.com-anycast-dispatch,haproxy.com-priv-anycast-dispatch,http01.acme-dcv.haproxy.com,iplookup_test,kubernetes-registry.haproxy.com,per_ip_post_rate,redmine,waf_blocked,waf_blocked_report,waf_falsepositive_form,waf_report}'::text[])))
Rows Removed by Filter: 59712
Buffers: shared hit=2102
Worker 0: actual time=0.039..29.481 rows=1055 loops=1
Buffers: shared hit=2044
-> Append (cost=2.61..15.23 rows=2000 width=24) (actual time=0.012..0.067 rows=168 loops=1095)
Buffers: shared hit=24930
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29523_chunk t_1 (cost=2.61..2.61 rows=1000 width=24) (actual time=0.012..0.028 rows=78 loops=1095)
Output: t_1."time", t_1.active_servers, t_1.tag_id
Filter: ((t_1."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_1."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_1.tag_id))
Rows Removed by Filter: 145
Buffers: shared hit=12465
-> Index Scan using compress_hyper_22_29578_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29578_chunk (cost=0.42..2.61 rows=1 width=92) (actual time=0.004..0.004 rows=1 loops=1095)
Output: compress_hyper_22_29578_chunk._ts_meta_count, compress_hyper_22_29578_chunk."time", compress_hyper_22_29578_chunk.tag_id, compress_hyper_22_29578_chunk.active_servers
Index Cond: (compress_hyper_22_29578_chunk.tag_id = haproxy_backend_tag.tag_id)
Filter: ((compress_hyper_22_29578_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29578_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
Buffers: shared hit=4305
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29564_chunk t_2 (cost=2.61..2.61 rows=1000 width=24) (actual time=0.019..0.029 rows=89 loops=1095)
Output: t_2."time", t_2.active_servers, t_2.tag_id
Filter: ((t_2."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_2."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_2.tag_id))
Rows Removed by Filter: 134
Buffers: shared hit=12465
-> Index Scan using compress_hyper_22_29616_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29616_chunk (cost=0.42..2.61 rows=1 width=92) (actual time=0.004..0.004 rows=1 loops=1095)
Output: compress_hyper_22_29616_chunk._ts_meta_count, compress_hyper_22_29616_chunk."time", compress_hyper_22_29616_chunk.tag_id, compress_hyper_22_29616_chunk.active_servers
Index Cond: (compress_hyper_22_29616_chunk.tag_id = haproxy_backend_tag.tag_id)
Filter: ((compress_hyper_22_29616_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29616_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
Buffers: shared hit=4305
Query Identifier: -8684628961912603055
Planning:
Buffers: shared hit=207
Planning Time: 1.285 ms
Execution Time: 383.932 ms
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT
    time_bucket('60.0s', "time") AS "time",
    host || ' ' || proxy,
    last(active_servers, time)
FROM c3811.haproxy_backend
WHERE
    "time" BETWEEN '2022-04-19T10:35:50.812Z' AND '2022-04-19T17:35:50.812Z'
    AND host LIKE ANY (array['%'])
    AND instance = 'myinstance'
    AND proxy LIKE ANY (array[/* ... */])
GROUP BY 1, host, proxy
ORDER BY 1
;
View definition:
CREATE OR REPLACE VIEW c3811.haproxy_backend AS
WITH tags AS NOT MATERIALIZED (
    SELECT *
    FROM telegraf.haproxy_backend_tag
    WHERE instance LIKE 'haproxy-c3811-%'
)
SELECT *
FROM tags
INNER JOIN telegraf.haproxy_backend t USING (tag_id);
Sort (cost=6831.13..6833.96 rows=1129 width=74) (actual time=1252.511..1270.936 rows=183600 loops=1)
Output: (time_bucket('00:01:00'::interval, haproxy_backend."time")), (((haproxy_backend.host || ' '::text) || haproxy_backend.proxy)), (last(haproxy_backend.active_servers, haproxy_backend."time")), haproxy_backend.host, haproxy_backend.proxy
Sort Key: (time_bucket('00:01:00'::interval, haproxy_backend."time"))
Sort Method: external merge Disk: 15096kB
Buffers: shared hit=447432, temp read=3557 written=5052
-> HashAggregate (cost=6751.31..6773.89 rows=1129 width=74) (actual time=948.598..1131.498 rows=183600 loops=1)
Output: (time_bucket('00:01:00'::interval, haproxy_backend."time")), ((haproxy_backend.host || ' '::text) || haproxy_backend.proxy), last(haproxy_backend.active_servers, haproxy_backend."time"), haproxy_backend.host, haproxy_backend.proxy
Group Key: time_bucket('00:01:00'::interval, haproxy_backend."time"), haproxy_backend.host, haproxy_backend.proxy
Batches: 5 Memory Usage: 9753kB Disk Usage: 15424kB
Buffers: shared hit=447432, temp read=1670 written=3164
-> Subquery Scan on haproxy_backend (cost=1002.64..6694.24 rows=5707 width=50) (actual time=0.828..863.932 rows=183600 loops=1)
Output: time_bucket('00:01:00'::interval, haproxy_backend."time"), haproxy_backend.host, haproxy_backend.proxy, haproxy_backend.active_servers, haproxy_backend."time"
Buffers: shared hit=447432
-> Nested Loop (cost=1002.64..6622.90 rows=5707 width=802) (actual time=0.826..835.058 rows=183600 loops=1)
Output: NULL::bigint, NULL::text, haproxy_backend_tag.host, NULL::text, haproxy_backend_tag.proxy, NULL::text, NULL::text, t_1."time", t_1.active_servers, NULL::bigint, NULL::bigint, NULL::text, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::text, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::text, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::double precision, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint, NULL::bigint
Buffers: shared hit=447432
-> Gather (cost=1000.00..6481.78 rows=4 width=34) (actual time=0.515..1.088 rows=1095 loops=1)
Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=2102
-> Parallel Seq Scan on telegraf.haproxy_backend_tag (cost=0.00..5481.38 rows=2 width=34) (actual time=0.056..16.298 rows=548 loops=2)
Output: haproxy_backend_tag.host, haproxy_backend_tag.proxy, haproxy_backend_tag.tag_id
Filter: ((haproxy_backend_tag.host ~~ ANY ('{%}'::text[])) AND (haproxy_backend_tag.instance ~~ 'haproxy-c3811-%'::text) AND (haproxy_backend_tag.instance = 'myinstance'::text) AND (haproxy_backend_tag.proxy ~~ ANY ('{admin_haproxy_com,adn-demo.haproxy.com,adn-ip-list,botdetection_css,botdetection_js,c3811-gsp_vault-acme_dcv,c3811_activate_haproxy,c3811_ui_priv,c3811_www,c3811_www_dev,c3811_www_download,demo.haproxy.net,dev.haproxy.com-anycast-dispatch,dev_haproxy_com,eleph.haproxy.com,haproxy.com,haproxy.com-anycast-dispatch,haproxy.com-priv-anycast-dispatch,http01.acme-dcv.haproxy.com,iplookup_test,kubernetes-registry.haproxy.com,per_ip_post_rate,redmine,waf_blocked,waf_blocked_report,waf_falsepositive_form,waf_report}'::text[])))
Rows Removed by Filter: 59712
Buffers: shared hit=2102
Worker 0: actual time=0.061..32.209 rows=1090 loops=1
Buffers: shared hit=2086
-> Append (cost=2.64..15.28 rows=2000 width=24) (actual time=0.155..0.686 rows=168 loops=1095)
Buffers: shared hit=445330
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29523_chunk t_1 (cost=2.64..2.64 rows=1000 width=24) (actual time=0.154..0.335 rows=78 loops=1095)
Output: t_1."time", t_1.active_servers, t_1.tag_id
Filter: ((t_1."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_1."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_1.tag_id))
Rows Removed by Filter: 145
Buffers: shared hit=222665
-> Index Scan using compress_hyper_22_29578_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29578_chunk (cost=0.42..2.64 rows=1 width=92) (actual time=0.005..0.005 rows=1 loops=1095)
Output: compress_hyper_22_29578_chunk._ts_meta_count, compress_hyper_22_29578_chunk."time", compress_hyper_22_29578_chunk.tag_id, compress_hyper_22_29578_chunk.active_servers, compress_hyper_22_29578_chunk.agg_server_check_status, compress_hyper_22_29578_chunk.backup_servers, compress_hyper_22_29578_chunk.cookie, compress_hyper_22_29578_chunk.ctime, compress_hyper_22_29578_chunk.ctime_max, compress_hyper_22_29578_chunk.h2_backend_open_streams, compress_hyper_22_29578_chunk.h2_backend_total_streams, compress_hyper_22_29578_chunk.h2_data_rcvd, compress_hyper_22_29578_chunk.h2_detected_conn_protocol_errors, compress_hyper_22_29578_chunk.h2_detected_strm_protocol_errors, compress_hyper_22_29578_chunk.h2_goaway_rcvd, compress_hyper_22_29578_chunk.h2_goaway_resp, compress_hyper_22_29578_chunk.h2_headers_rcvd, compress_hyper_22_29578_chunk.h2_open_connections, compress_hyper_22_29578_chunk.h2_rst_stream_rcvd, compress_hyper_22_29578_chunk.h2_rst_stream_resp, compress_hyper_22_29578_chunk.h2_settings_rcvd, compress_hyper_22_29578_chunk.h2_total_connections, compress_hyper_22_29578_chunk.iid, compress_hyper_22_29578_chunk.lastchg, compress_hyper_22_29578_chunk.lastsess, compress_hyper_22_29578_chunk.mode, compress_hyper_22_29578_chunk.pid, compress_hyper_22_29578_chunk.qcur, compress_hyper_22_29578_chunk.qmax, compress_hyper_22_29578_chunk.qtime, compress_hyper_22_29578_chunk.qtime_max, compress_hyper_22_29578_chunk.rate, compress_hyper_22_29578_chunk.rate_max, compress_hyper_22_29578_chunk.reuse, compress_hyper_22_29578_chunk.rtime, compress_hyper_22_29578_chunk.rtime_max, compress_hyper_22_29578_chunk.scur, compress_hyper_22_29578_chunk.sid, compress_hyper_22_29578_chunk.slim, compress_hyper_22_29578_chunk.smax, compress_hyper_22_29578_chunk.srv_abort, compress_hyper_22_29578_chunk.ssl_failed_handshake, compress_hyper_22_29578_chunk.ssl_reused_sess, compress_hyper_22_29578_chunk.ssl_sess, compress_hyper_22_29578_chunk.status, compress_hyper_22_29578_chunk.stot, 
compress_hyper_22_29578_chunk.ttime, compress_hyper_22_29578_chunk.ttime_max, compress_hyper_22_29578_chunk.uweight, compress_hyper_22_29578_chunk.weight, compress_hyper_22_29578_chunk.wredis, compress_hyper_22_29578_chunk.wretr, compress_hyper_22_29578_chunk.wrew, compress_hyper_22_29578_chunk.bin, compress_hyper_22_29578_chunk.bout, compress_hyper_22_29578_chunk.cache_hits, compress_hyper_22_29578_chunk.cache_lookups, compress_hyper_22_29578_chunk.chkdown, compress_hyper_22_29578_chunk.cli_abort, compress_hyper_22_29578_chunk.connect, compress_hyper_22_29578_chunk.downtime, compress_hyper_22_29578_chunk.duration, compress_hyper_22_29578_chunk.econ, compress_hyper_22_29578_chunk.eint, compress_hyper_22_29578_chunk.eresp, compress_hyper_22_29578_chunk."http_response.1xx", compress_hyper_22_29578_chunk."http_response.2xx", compress_hyper_22_29578_chunk."http_response.3xx", compress_hyper_22_29578_chunk."http_response.4xx", compress_hyper_22_29578_chunk."http_response.5xx", compress_hyper_22_29578_chunk."http_response.other", compress_hyper_22_29578_chunk.lbtot, compress_hyper_22_29578_chunk.req_tot
Index Cond: (compress_hyper_22_29578_chunk.tag_id = haproxy_backend_tag.tag_id)
Filter: ((compress_hyper_22_29578_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29578_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
Buffers: shared hit=4305
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_21_29564_chunk t_2 (cost=2.64..2.64 rows=1000 width=24) (actual time=0.257..0.336 rows=89 loops=1095)
Output: t_2."time", t_2.active_servers, t_2.tag_id
Filter: ((t_2."time" >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (t_2."time" <= '2022-04-19 17:35:50.812'::timestamp without time zone) AND (haproxy_backend_tag.tag_id = t_2.tag_id))
Rows Removed by Filter: 134
Buffers: shared hit=222665
-> Index Scan using compress_hyper_22_29616_chunk__compressed_hypertable_22_tag_id_ on _timescaledb_internal.compress_hyper_22_29616_chunk (cost=0.42..2.64 rows=1 width=92) (actual time=0.005..0.005 rows=1 loops=1095)
Output: compress_hyper_22_29616_chunk._ts_meta_count, compress_hyper_22_29616_chunk."time", compress_hyper_22_29616_chunk.tag_id, compress_hyper_22_29616_chunk.active_servers, compress_hyper_22_29616_chunk.agg_server_check_status, compress_hyper_22_29616_chunk.backup_servers, compress_hyper_22_29616_chunk.cookie, compress_hyper_22_29616_chunk.ctime, compress_hyper_22_29616_chunk.ctime_max, compress_hyper_22_29616_chunk.h2_backend_open_streams, compress_hyper_22_29616_chunk.h2_backend_total_streams, compress_hyper_22_29616_chunk.h2_data_rcvd, compress_hyper_22_29616_chunk.h2_detected_conn_protocol_errors, compress_hyper_22_29616_chunk.h2_detected_strm_protocol_errors, compress_hyper_22_29616_chunk.h2_goaway_rcvd, compress_hyper_22_29616_chunk.h2_goaway_resp, compress_hyper_22_29616_chunk.h2_headers_rcvd, compress_hyper_22_29616_chunk.h2_open_connections, compress_hyper_22_29616_chunk.h2_rst_stream_rcvd, compress_hyper_22_29616_chunk.h2_rst_stream_resp, compress_hyper_22_29616_chunk.h2_settings_rcvd, compress_hyper_22_29616_chunk.h2_total_connections, compress_hyper_22_29616_chunk.iid, compress_hyper_22_29616_chunk.lastchg, compress_hyper_22_29616_chunk.lastsess, compress_hyper_22_29616_chunk.mode, compress_hyper_22_29616_chunk.pid, compress_hyper_22_29616_chunk.qcur, compress_hyper_22_29616_chunk.qmax, compress_hyper_22_29616_chunk.qtime, compress_hyper_22_29616_chunk.qtime_max, compress_hyper_22_29616_chunk.rate, compress_hyper_22_29616_chunk.rate_max, compress_hyper_22_29616_chunk.reuse, compress_hyper_22_29616_chunk.rtime, compress_hyper_22_29616_chunk.rtime_max, compress_hyper_22_29616_chunk.scur, compress_hyper_22_29616_chunk.sid, compress_hyper_22_29616_chunk.slim, compress_hyper_22_29616_chunk.smax, compress_hyper_22_29616_chunk.srv_abort, compress_hyper_22_29616_chunk.ssl_failed_handshake, compress_hyper_22_29616_chunk.ssl_reused_sess, compress_hyper_22_29616_chunk.ssl_sess, compress_hyper_22_29616_chunk.status, compress_hyper_22_29616_chunk.stot, 
compress_hyper_22_29616_chunk.ttime, compress_hyper_22_29616_chunk.ttime_max, compress_hyper_22_29616_chunk.uweight, compress_hyper_22_29616_chunk.weight, compress_hyper_22_29616_chunk.wredis, compress_hyper_22_29616_chunk.wretr, compress_hyper_22_29616_chunk.wrew, compress_hyper_22_29616_chunk.bin, compress_hyper_22_29616_chunk.bout, compress_hyper_22_29616_chunk.cache_hits, compress_hyper_22_29616_chunk.cache_lookups, compress_hyper_22_29616_chunk.chkdown, compress_hyper_22_29616_chunk.cli_abort, compress_hyper_22_29616_chunk.connect, compress_hyper_22_29616_chunk.downtime, compress_hyper_22_29616_chunk.duration, compress_hyper_22_29616_chunk.econ, compress_hyper_22_29616_chunk.eint, compress_hyper_22_29616_chunk.eresp, compress_hyper_22_29616_chunk."http_response.1xx", compress_hyper_22_29616_chunk."http_response.2xx", compress_hyper_22_29616_chunk."http_response.3xx", compress_hyper_22_29616_chunk."http_response.4xx", compress_hyper_22_29616_chunk."http_response.5xx", compress_hyper_22_29616_chunk."http_response.other", compress_hyper_22_29616_chunk.lbtot, compress_hyper_22_29616_chunk.req_tot
Index Cond: (compress_hyper_22_29616_chunk.tag_id = haproxy_backend_tag.tag_id)
Filter: ((compress_hyper_22_29616_chunk._ts_meta_max_1 >= '2022-04-19 14:35:50.812'::timestamp without time zone) AND (compress_hyper_22_29616_chunk._ts_meta_min_1 <= '2022-04-19 17:35:50.812'::timestamp without time zone))
Buffers: shared hit=4305
Query Identifier: 6896141023051905804
Planning:
Buffers: shared hit=208
Planning Time: 1.956 ms
Execution Time: 1378.840 ms
OK, I think I can reproduce it, and a view is indeed slower because it decompresses all the columns, even when the EXPLAIN output is otherwise the same.
The reason is probably here: https://github.com/timescale/timescaledb/blob/main/tsl/src/nodes/decompress_chunk/planner.c#L130
We use RangeTblEntry.selectedCols to determine which columns to decompress, but that's wrong: these are the columns that need SELECT permissions. Because the view does 'SELECT *', it needs permissions on all the columns, so we think we have to decompress all of them. I'll see how it can be fixed.
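Until this is fixed, one possible workaround suggested by the selectedCols explanation above (untested, hypothetical view name) is to avoid 'SELECT *' in the view and expose only the columns that queries actually need:

```sql
-- Hypothetical workaround sketch: listing columns explicitly keeps
-- RangeTblEntry.selectedCols limited to the columns the view exposes,
-- so only those should be candidates for decompression.
CREATE OR REPLACE VIEW c3811.haproxy_backend_narrow AS
SELECT t.tag_id, t."time", t.active_servers, tag.host, tag.proxy
FROM telegraf.haproxy_backend t
INNER JOIN telegraf.haproxy_backend_tag tag USING (tag_id)
WHERE tag.instance LIKE 'haproxy-c3811-%';
```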
What type of bug is this?
Performance issue
What subsystems and features are affected?
Compression
What happened?
When using a view, data retrieval from compressed chunks decompresses all columns regardless of whether they are used in the query. This causes a significant performance impact.
TimescaleDB version affected
2.6.0
PostgreSQL version used
14.2
What operating system did you use?
docker image timescale/timescaledb:2.6.0-pg14
What installation method did you use?
Docker
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?
Note the Output: line for each chunk, and how it only outputs column a. This is the correct behavior. Compress a chunk and try again: notice how it now outputs all columns on the compressed chunk.
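The original repro code blocks did not survive here; the following is a minimal self-contained sketch along the lines described, with hypothetical table/view names, assuming TimescaleDB 2.6 on PostgreSQL 14:

```sql
-- Hypothetical minimal repro sketch.
CREATE TABLE tbl ("time" timestamptz NOT NULL, a int, b int, c int);
SELECT create_hypertable('tbl', 'time');
INSERT INTO tbl
SELECT t, 1, 2, 3
FROM generate_series(now() - interval '1 day', now(), interval '1 minute') t;

CREATE VIEW v AS SELECT * FROM tbl;

-- Before compression, both plans should output only column 'a'.
EXPLAIN (VERBOSE, COSTS OFF) SELECT a FROM v;

-- Compress a chunk and try again:
ALTER TABLE tbl SET (timescaledb.compress);
SELECT compress_chunk(c) FROM show_chunks('tbl') c;

-- With the view, the scan on the compressed chunk now lists all columns;
-- querying the hypertable directly does not.
EXPLAIN (VERBOSE, COSTS OFF) SELECT a FROM v;
EXPLAIN (VERBOSE, COSTS OFF) SELECT a FROM tbl;  -- for comparison
```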