Closed amaury-d closed 7 years ago
Thanks for the report!
Can you easily reproduce this issue on a non production database? If a schema-only dump and the query enough to reproduce this issue, could you provide the schema-only dump? (powa (AT) dalibo (dot) com if you need privacy).
Which version of pg_qualstats is it?
We just received another report which looks similar to this issue.
Here are the steps to reproduce the problem (need 9.6 psql for \gexec command):
CREATE TABLE ttt(id1 integer, id2 integer);
INSERT INTO ttt select i, i from generate_series(1,600) i;
with s(v) as (select string_agg(i::text || ', ' || i::text, '),(') from generate_series(1,500)i)
select 'select count(*) from ttt where (id1, id2) in (( ' || v || '))' from s;
\gexec
Commit https://github.com/dalibo/pg_qualstats/commit/82d0b086b91d5bdfffc6945ac5942f439393e095 should fix this issue. Could you confirm it also fixes your test case?
Thanks, transmitted to our Postgresql team, waiting for their return.
I'm closing this issue now, since this should now be fixed. If it's not the case, please reopen it.
Thanks again for reporting the problem.
Hello,
When I execute the following request around 10-50 times in a row, postgres hangs randomly on the connection. New connections are still working.
The CPU peaks at 100% and the process needs to be SIGKILLed. SIGTERM is not treated.
The request is:
SELECT * FROM MEASURE_FLAT_H INNER JOIN ( SELECT max(dt) as dtmax FROM MEASURE_FLAT_H WHERE (ASSETS_NAME = ?) AND (METRIC_NAME = ?) ) d ON (D.DTMAX = DT) WHERE (ASSETS_NAME = ?) AND (METRIC_NAME = ?)
At this step, no lock is found in pg_locks.The process can be found stuck in the hash_seq_search().
Call stack:
0 0x000000000075e919 in hash_seq_search ()
1 0x00007fd0af74fb2b in pgqs_entry_dealloc ()
from /tech/postgres/9.4.1/server/lib/pg_qualstats.so
2 0x00007fd0af750a15 in pgqs_ExecutorEnd ()
from /tech/postgres/9.4.1/server/lib/pg_qualstats.so
3 0x000000000055dade in PortalCleanup ()
4 0x000000000077438a in PortalDrop ()
5 0x000000000077472a in PreCommit_Portals ()
6 0x00000000004b36ec in CommitTransaction ()
7 0x00000000004b4ab5 in CommitTransactionCommand ()
8 0x000000000067a269 in finish_xact_command ()
9 0x000000000067d685 in PostgresMain ()
10 0x000000000062c246 in PostmasterMain ()
11 0x00000000005c4ab8 in main ()
Each time, I've seen the process looping infinitely from 75e910 to 75e927:
0x000000000075e910 <+96>: mov (%r9,%rcx,8),%rax 0x000000000075e914 <+100>: test %rax,%rax 0x000000000075e917 <+103>: jne 0x75e93c <hash_seq_search+140> 0x000000000075e919 <+105>: add $0x1,%edx ---Type to continue, or q to quit---
=> 0x000000000075e91c <+108>: cmp %edx,%esi
0x000000000075e91e <+110>: jb 0x75e958 <hash_seq_search+168>
0x000000000075e920 <+112>: add $0x1,%rcx
0x000000000075e924 <+116>: cmp %rcx,%r8
0x000000000075e927 <+119>: jg 0x75e910 <hash_seq_search+96>
For information, other hangs was also met in the past with other requests in similar situations but I don't have traces and can't confirm the issue was also in hash_seq_search().
Best regards,
Amaury