Closed alzamonn closed 1 year ago
Thank you for the report. Can you please share a CPU profile:
curl -o cpu_profile.tgz 'http://<exporter_ip>:<exporter_port>/debug/pprof/profile?seconds=60'
Report attached, thanks for the reply: cpu_profile1714.zip
also adding more info
/usr/bin/coroot-pg-agent --version
1.2.0
The metrics request does not complete; it hangs at this point:
curl http://localhost:38888/metrics > metrics1.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- **0:04:41** --:--:--     0
Looks like there are very large queries in pg_stat_statements.
Let's check this out:
select count(1), max(length(query)) from pg_stat_statements
hi! I checked.
first db
postgres=# select count(1), max(length(query)) from pg_stat_statements;
count | max
-------+--------
4773 | 243392
(1 row)
second db
postgres=# select count(1), max(length(query)) from pg_stat_statements;
count | max
-------+--------
4771 | 630379
(1 row)
For comparison, another random db in our environment that has no problems so far:
postgres=# select count(1), max(length(query)) from pg_stat_statements;
count | max
-------+-------
2279 | 24206
(1 row)
Yeah, 630 KB is a lot. We definitely should trim such long queries. Will fix it ASAP.
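To illustrate the idea of trimming, here is a minimal sketch of capping query text before any further processing. It is not the agent's actual code; `trim_query`, `MAX_QUERY_CHARS`, and the 4096-character limit are hypothetical names and values chosen for the example.

```python
# Hypothetical limit for illustration only; not the value used by coroot-pg-agent.
MAX_QUERY_CHARS = 4096


def trim_query(query: str, limit: int = MAX_QUERY_CHARS) -> str:
    """Return the query unchanged if it fits, else its first `limit` chars plus a marker."""
    if len(query) <= limit:
        return query
    # Cut first, so any later normalization works on a bounded amount of text.
    return query[:limit] + "..."


if __name__ == "__main__":
    # Same length as the largest query reported above.
    huge = "x" * 630379
    print(len(trim_query(huge)))
```

With a cap like this, the cost of processing one statement depends on the limit rather than on the longest entry in pg_stat_statements.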
@alzamonn, please update to coroot-pg-agent:1.2.1
Hey guys, pg_stat_statements treats queries with a different number of parameters as different queries:
9.6
DELETE FROM table WHERE id1 IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);
DELETE FROM table WHERE id1 IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);
Fixed in PG 10: https://github.com/postgres/postgres/commit/83f2061dd037477ec8479ee160367840e203a722
Some attempts to fix: https://stackoverflow.com/questions/74220833/how-to-make-pg-stat-statement-merge-queries-with-variable-number-of-parameters-i
With 1000 ids in an array parameter, those will be pretty long queries.
One more issue still affects all PG versions:
10+ https://github.com/postgres/postgres/commit/a6f22e83562d8b78293229587cd3d9430d16d466
INSERT INTO table (id1, id2, id3, id4, id5, id6, id7, id8, id9, id10) values ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10),($11,$12,$13,$14,$15,$16,$17,$18,$19,$20);
INSERT INTO table (id1, id2, id3, id4, id5, id6, id7, id8, id9, id10) values ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10);
9.6
INSERT INTO table (id1, id2, id3, id4, id5, id6, id7, id8, id9, id10) values (?,?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?,?);
INSERT INTO table (id1, id2, id3, id4, id5, id6, id7, id8, id9, id10) values (?,?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?,?);
Instead of cutting off the WHERE clause at the end of the query, it would be better to reduce the parameter list to a single placeholder, which keeps queries both short and distinct.
For tables with 1000+ columns, it makes sense to replace the column list with * to reduce query size as well, because the column list does not affect any PG performance stats.
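The parameter-reduction idea can be sketched roughly as follows. This is an illustrative regex-based approach, not how pg-agent implements it; `collapse_placeholders` and the patterns are hypothetical, and a real normalizer would need to handle string literals, comments, and nested expressions.

```python
import re

# A placeholder is either `?` (pg_stat_statements up to 9.6) or `$N` (10+).
_PLACEHOLDER = r"(?:\?|\$\d+)"

# A parenthesised list made only of placeholders: (?, ?, ?) or ($1,$2,$3).
_PLACEHOLDER_LIST = re.compile(
    r"\(\s*" + _PLACEHOLDER + r"(?:\s*,\s*" + _PLACEHOLDER + r")*\s*\)"
)

# Two or more consecutive collapsed tuples: (?),(?),(?)
_TUPLE_RUN = re.compile(r"\(\?\)(?:\s*,\s*\(\?\))+")


def collapse_placeholders(query: str) -> str:
    """Collapse placeholder lists and repeated VALUES tuples to a single `(?)`."""
    collapsed = _PLACEHOLDER_LIST.sub("(?)", query)
    return _TUPLE_RUN.sub("(?)", collapsed)


if __name__ == "__main__":
    print(collapse_placeholders("DELETE FROM table WHERE id1 IN (?, ?, ?, ?)"))
    print(collapse_placeholders("INSERT INTO table (id1, id2) values ($1,$2),($3,$4)"))
```

Under this sketch, the DELETE variants from the examples above map to one text, as do the single-row and multi-row INSERT variants. Note that it also rewrites any other parenthesised placeholder list, such as function arguments, which may or may not be desirable.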
@dxops, in this particular case the most CPU time was spent on query text normalization. So, I don't see any other way to make CPU consumption predictable than to trim query texts.
Collapsing repetitive query arguments is already implemented in pg-agent.
Environmental Info:
CentOS Linux release 7.9.2009 (Core), kernel 3.10.0-1160.76.1.el7.x86_64, PostgreSQL 9.6.24, pg_stat_statements 1.4
RHEL7.9, kernel 5.4.17-2136.312.3.4.el7uek.x86_64, PostgreSQL 12.11, pg_stat_statements 1.7
Describe the bug: We have a few Postgres instances where we are seeing high CPU load caused by the coroot-pg-agent process. In some situations, the server crashes. After coroot-pg-agent is stopped, the CPU load immediately decreases. When we start coroot-pg-agent again, the load increases after about a minute.
node with 4 cpu:
Service coroot-pg-agent running via systemd unit.
Please advise on how to investigate this issue further. If you need more information, please let me know.
Additional context / logs:
lsof