CrunchyData / crunchy-containers

Containers for Managing PostgreSQL on Kubernetes by Crunchy Data
https://www.crunchydata.com/
Apache License 2.0

Memory issues caused by crunchy-postgres-exporter #1381

Closed: schewara closed this issue 3 years ago

schewara commented 3 years ago

Describe the bug

Running a standalone postgres with monitoring enabled causes the memory usage to grow continuously until the configured limit is reached.

The affected metrics are ccp_nodemx_mem_active_anon and ccp_nodemx_mem_rss.

Restarting the postgres-exporter frees up the used memory immediately.

Further investigation showed that a specific query consistently took around 600 ms:

```
postgres=# EXPLAIN ANALYZE
WITH all_backups AS (
    SELECT config_file
         , jsonb_array_elements(data) AS stanza_data
    FROM monitor.pgbackrest_info(10)
)
, per_stanza AS (
    SELECT config_file
         , stanza_data->>'name' AS stanza
         , jsonb_array_elements(stanza_data->'backup') AS backup_data
    FROM all_backups
)
SELECT a.config_file
     , a.stanza
     , a.backup_data->>'type' AS backup_type
     , a.backup_data->'info'->'repository'->>'delta' AS repo_backup_size_bytes
     , a.backup_data->'info'->'repository'->>'size' AS repo_total_size_bytes
     , (a.backup_data->'timestamp'->>'stop')::bigint
       - (a.backup_data->'timestamp'->>'start')::bigint AS backup_runtime_seconds
FROM per_stanza a
JOIN (
    SELECT config_file
         , stanza
         , backup_data->>'type' AS backup_type
         , max(backup_data->'timestamp'->>'start') AS max_backup_start
         , max(backup_data->'timestamp'->>'stop') AS max_backup_stop
    FROM per_stanza
    GROUP BY 1, 2, 3
) b
  ON a.config_file = b.config_file
 AND a.stanza = b.stanza
 AND a.backup_data->>'type' = b.backup_type
 AND a.backup_data->'timestamp'->>'start' = b.max_backup_start
 AND a.backup_data->'timestamp'->>'stop' = b.max_backup_stop;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Hash Join  (cost=2626697.50..3279629.74 rows=31 width=168) (actual time=630.971..631.104 rows=2 loops=1)
   Hash Cond: ((a.config_file = per_stanza.config_file) AND (a.stanza = per_stanza.stanza) AND ((a.backup_data ->> 'type'::text) = ((per_stanza.backup_data ->> 'type'::text))) AND (((a.backup_data -> 'timestamp'::text) ->> 'start'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'start'::text)))) AND (((a.backup_data -> 'timestamp'::text) ->> 'stop'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'stop'::text)))))
   CTE all_backups
     ->  ProjectSet  (cost=0.25..517.75 rows=100000 width=64) (actual time=630.327..630.328 rows=1 loops=1)
           ->  Function Scan on pgbackrest_info  (cost=0.25..10.25 rows=1000 width=64) (actual time=630.308..630.308 rows=1 loops=1)
   CTE per_stanza
     ->  ProjectSet  (cost=0.00..53250.00 rows=10000000 width=96) (actual time=630.337..630.340 rows=6 loops=1)
           ->  CTE Scan on all_backups  (cost=0.00..2000.00 rows=100000 width=64) (actual time=630.329..630.330 rows=1 loops=1)
   ->  CTE Scan on per_stanza a  (cost=0.00..200000.00 rows=10000000 width=96) (actual time=630.339..630.340 rows=6 loops=1)
   ->  Hash  (cost=2527968.75..2527968.75 rows=1000000 width=160) (actual time=0.208..0.208 rows=2 loops=1)
         Buckets: 65536  Batches: 32  Memory Usage: 512kB
         ->  HashAggregate  (cost=2134375.00..2517968.75 rows=1000000 width=160) (actual time=0.072..0.104 rows=2 loops=1)
               Group Key: per_stanza.config_file, per_stanza.stanza, (per_stanza.backup_data ->> 'type'::text)
               Planned Partitions: 64  Batches: 1  Memory Usage: 409kB
               ->  CTE Scan on per_stanza  (cost=0.00..225000.00 rows=10000000 width=128) (actual time=0.002..0.009 rows=6 loops=1)
 Planning Time: 0.267 ms
 JIT:
   Functions: 38
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 7.328 ms, Inlining 16.180 ms, Optimization 379.258 ms, Emission 234.166 ms, Total 636.932 ms
 Execution Time: 638.649 ms
(21 rows)
```
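Nearly all of that runtime is JIT overhead: per the JIT timing line, 636.9 ms of the 638.6 ms total. The plan also shows why JIT fires for a six-row result: the planner has no statistics for the set-returning functions, so it estimates 10,000,000 rows coming out of per_stanza, and the inflated cost crosses every JIT threshold. A minimal sketch for inspecting the relevant settings (the values in the comments are the stock PostgreSQL defaults):

```sql
-- Sketch: the plan's total cost (~3.28M) exceeds all three JIT thresholds,
-- so compilation, inlining, and full optimization all run for this query.
SHOW jit;                      -- on by default since PostgreSQL 12
SHOW jit_above_cost;           -- default 100000: compile above this cost
SHOW jit_inline_above_cost;    -- default 500000: LLVM inlining above this
SHOW jit_optimize_above_cost;  -- default 500000: heavy optimization above this
```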

Further investigation pointed me toward JIT, and disabling it massively improved the execution time:

```
postgres=# SET jit = off;
SET

postgres=# EXPLAIN ANALYZE
WITH all_backups AS (
    SELECT config_file
         , jsonb_array_elements(data) AS stanza_data
    FROM monitor.pgbackrest_info(10)
)
, per_stanza AS (
    SELECT config_file
         , stanza_data->>'name' AS stanza
         , jsonb_array_elements(stanza_data->'backup') AS backup_data
    FROM all_backups
)
SELECT a.config_file
     , a.stanza
     , a.backup_data->>'type' AS backup_type
     , a.backup_data->'info'->'repository'->>'delta' AS repo_backup_size_bytes
     , a.backup_data->'info'->'repository'->>'size' AS repo_total_size_bytes
     , (a.backup_data->'timestamp'->>'stop')::bigint
       - (a.backup_data->'timestamp'->>'start')::bigint AS backup_runtime_seconds
FROM per_stanza a
JOIN (
    SELECT config_file
         , stanza
         , backup_data->>'type' AS backup_type
         , max(backup_data->'timestamp'->>'start') AS max_backup_start
         , max(backup_data->'timestamp'->>'stop') AS max_backup_stop
    FROM per_stanza
    GROUP BY 1, 2, 3
) b
  ON a.config_file = b.config_file
 AND a.stanza = b.stanza
 AND a.backup_data->>'type' = b.backup_type
 AND a.backup_data->'timestamp'->>'start' = b.max_backup_start
 AND a.backup_data->'timestamp'->>'stop' = b.max_backup_stop;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Hash Join  (cost=2626697.50..3279629.74 rows=31 width=168) (actual time=1.019..1.232 rows=2 loops=1)
   Hash Cond: ((a.config_file = per_stanza.config_file) AND (a.stanza = per_stanza.stanza) AND ((a.backup_data ->> 'type'::text) = ((per_stanza.backup_data ->> 'type'::text))) AND (((a.backup_data -> 'timestamp'::text) ->> 'start'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'start'::text)))) AND (((a.backup_data -> 'timestamp'::text) ->> 'stop'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'stop'::text)))))
   CTE all_backups
     ->  ProjectSet  (cost=0.25..517.75 rows=100000 width=64) (actual time=0.405..0.406 rows=1 loops=1)
           ->  Function Scan on pgbackrest_info  (cost=0.25..10.25 rows=1000 width=64) (actual time=0.381..0.382 rows=1 loops=1)
   CTE per_stanza
     ->  ProjectSet  (cost=0.00..53250.00 rows=10000000 width=96) (actual time=0.415..0.419 rows=6 loops=1)
           ->  CTE Scan on all_backups  (cost=0.00..2000.00 rows=100000 width=64) (actual time=0.407..0.408 rows=1 loops=1)
   ->  CTE Scan on per_stanza a  (cost=0.00..200000.00 rows=10000000 width=96) (actual time=0.417..0.419 rows=6 loops=1)
   ->  Hash  (cost=2527968.75..2527968.75 rows=1000000 width=160) (actual time=0.246..0.247 rows=2 loops=1)
         Buckets: 65536  Batches: 32  Memory Usage: 512kB
         ->  HashAggregate  (cost=2134375.00..2517968.75 rows=1000000 width=160) (actual time=0.104..0.128 rows=2 loops=1)
               Group Key: per_stanza.config_file, per_stanza.stanza, (per_stanza.backup_data ->> 'type'::text)
               Planned Partitions: 64  Batches: 1  Memory Usage: 409kB
               ->  CTE Scan on per_stanza  (cost=0.00..225000.00 rows=10000000 width=128) (actual time=0.002..0.012 rows=6 loops=1)
 Planning Time: 0.237 ms
 Execution Time: 1.374 ms
(17 rows)
```

In light of these findings (638.6 ms with JIT versus 1.4 ms without), it would be good to also disable jit in the postgresql.conf templates.
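For reference, a hedged sketch of what that change could look like. The per-role variant is a narrower alternative if disabling JIT instance-wide is too broad; the role name ccp_monitoring is my assumption for the exporter's monitoring user, not something confirmed in this issue:

```sql
-- Sketch: disable JIT instance-wide, equivalent to putting "jit = off"
-- into the postgresql.conf template.
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();  -- jit is reloadable; no restart needed

-- Narrower alternative: turn JIT off only for the exporter's sessions,
-- leaving it available for other workloads. Role name is assumed.
ALTER ROLE ccp_monitoring SET jit = off;
```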

I am currently testing the setup with jit disabled and will know within the next day whether the memory consumption stays stable.

To Reproduce

Steps to reproduce the behavior:

  1. Start a postgres container with monitoring enabled
  2. Start an exporter container with the database configured
  3. Request the metrics endpoint, or have Prometheus do that for you
  4. Watch the metrics rise (a sketch for locating the affected backend follows this list)
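To pinpoint where the memory actually accumulates, it can help to look at the exporter's database session. A sketch, with the role name ccp_monitoring assumed rather than taken from this issue:

```sql
-- Sketch: the exporter holds a long-lived connection, so repeated JIT
-- compilations accumulate in that single backend. Find its PID here,
-- then watch that process's RSS on the node.
SELECT pid, usename, application_name, backend_start
FROM pg_stat_activity
WHERE usename = 'ccp_monitoring';  -- assumed monitoring role name
```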

Expected behavior

Memory consumption should remain at a stable level.

Screenshots

Will follow.

Please tell us about your environment:

schewara commented 3 years ago

Screenshot of the memory graph: [image: crunchy-pg-memory-stats]

jkatz commented 3 years ago

Please see #1309, which links to a full explanation.