zubkov-andrei / pg_profile

Postgres historic workload reports
Other
212 stars 31 forks source link

duplicate key on index pk_sample_statements_n while executing take_snapshot #38

Closed portnov closed 2 years ago

portnov commented 2 years ago

While executing select * from profile.take_snapshot(), I got

...
PL/pgSQL function collect_pg_stat_statements_stats(jsonb,integer,integer,integer) line 598 at SQL statement
SQL statement ""SELECT collect_pg_stat_statements_stats(server_properties, sserver_id, s_id, topn)""
PL/pgSQL function take_sample(integer,boolean) line 646 at PERFORM
PL/pgSQL function take_sample_subset(integer,integer) line 27 at assignment
SQL function ""take_sample"" statement 1
Key (server_id, sample_id, datid, userid, queryid)=(1, 10, 24797, 16385, 4955406013092116682) already exists.",00:00:00.09

While trying to investigate, I found out that in pg_stat_statements there are two records with the same queryid=4955406013092116682. One of them was EXPLAIN select..., with toplevel = f, and another was that same select, with toplevel = t. And yes, I have pg_stat_statements.track = all, but your documentation vaguely says that "all" is a problem only with Postgres before 14, while I have 14.3.

I'm not sure if this an issue with pg_profile or with postgres itself? Should it be possible that two technically different queries have the same queryid? and why EXPLAIN has toplevel=false, while underlaying query has toplevel=true?...

If it matters, that is simply EXPLAIN, not explain analyze. At first I thought that simple select appeared in pg_stat_statement as a part of executing explain. But since it was not explain analyze, I'm not sure. Maybe it really somehow appeared that explain was executed as a nested query, and the same query as the one being explained was issued by totally different reason...

zubkov-andrei commented 2 years ago

Hello! Yes, this is a bug. Duplicate of #37. Will be fixed in next release.

portnov commented 2 years ago

Thanks for such a fast answer! :)