citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

postgres[495645]: segfault at 0 ip 00007f318b17e1f4 sp 00007ffc7f1b15d8 error 4 in citus.so[7f318b0a4000+ee000] likely on CPU 93 (core 1, socket 1) #7603

Closed ujae7142 closed 3 months ago

ujae7142 commented 4 months ago

Rocky Linux 9.3, 4-socket machine.

The scenario was:

"server process (PID 487952) was terminated by signal 11: Segmentation fault","Failed process was running: select ct.conname as constraint_name, a.attname as column_name, fc.relname as foreign_table_name, fns.nspname as foreign_table_schema, fa.attname as foreign_column_name from (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype, ct.confkey, generate_subscripts(ct.conkey, 1) AS s FROM pg_constraint ct ) AS ct inner join pg_class c on c.oid=ct.conrelid inner join pg_namespace ns on c.relnamespace=ns.oid inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum = ct.conkey[ct.s] left join pg_class fc on fc.oid=ct.confrelid left join pg_namespace fns on fc.relnamespace=fns.oid left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum = ct.confkey[ct.s] where ct.contype='f' and c.relname='table1' and ns.nspname='schemauser' order by fns.nspname, fc.relname, a.attnum ;

"terminating any other active server processes",,,,,,,,,"","postmaster",,0

OS log said: postgres[495645]: segfault at 0 ip 00007f318b17e1f4 sp 00007ffc7f1b15d8 error 4 in citus.so[7f318b0a4000+ee000] likely on CPU 93 (core 1, socket 1)

After dropping the citus extension, the segfault is gone. Please let me know: am I missing something after a minor version update of PG?

I tried dropping and re-creating the extension; the result was the same.
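
For readers trying to triage a kernel line like the one quoted above, here is a minimal sketch of how its fields can be decoded. It assumes the usual x86 Linux format, in which the error value is printed in hexadecimal and its low bits are the page-fault flags. The function name `decode_segfault` and the returned keys are made up for illustration; they are not part of Citus or PostgreSQL.

```python
import re

# Kernel log line copied from the report above.
LOG = ("postgres[495645]: segfault at 0 ip 00007f318b17e1f4 sp 00007ffc7f1b15d8 "
       "error 4 in citus.so[7f318b0a4000+ee000]")

# All numeric fields, including the error code, are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a segfault line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"], 16)
    offset = ip - base
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),  # 0 means a NULL pointer access
        "offset_in_library": offset,          # relative to the printed mapping start
        "inside_mapping": 0 <= offset < size,
        "page_present": bool(err & 1),        # bit 0: 0 = page not present
        "access": "write" if err & 2 else "read",   # bit 1
        "mode": "user" if err & 4 else "kernel",    # bit 2
    }


if __name__ == "__main__":
    info = decode_segfault(LOG)
    print(info["library"], hex(info["offset_in_library"]), info["access"], info["mode"])
```

For this report the decoded fields describe a user-mode read of address 0 from code inside citus.so, that is, a NULL pointer dereference. If debug symbols are installed, the offset can be handed to a tool such as `addr2line -e`, with the caveat that the mapping start printed by the kernel may differ from the library's load base.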

themikem commented 4 months ago

I have been seeing something similar on PG 16 on WSL with Ubuntu 24.04 and 22.04. Introspection crashes the DB at the select query. CPU: i9-13980HX, latest Win11, WSL 2, etc.

JelteF commented 4 months ago

Yes, this was a regression in 12.1.3; we're working on releasing a fix. See #7604

Green-Chan commented 3 months ago

The mentioned pull request is merged and the version is bumped, but this issue is still open and the 12.1.4 release is listed neither in https://github.com/citusdata/citus/releases nor in https://www.citusdata.com/updates/v12-1. That's confusing.

gurkanindibay commented 3 months ago

Thanks @Green-Chan for the heads up. I added the release: https://github.com/citusdata/citus/releases/tag/v12.1.4. Closing the issue.

ujae7142 commented 3 months ago

Yesterday I updated to citus 12.1.4 and the problem still exists. However, I don't think the root cause is citus; my guess is that it is due to the NUMA or transparent_hugepage settings.

I changed the Rocky 9 default boot parameters:

Now the error is gone.
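
The comment above only guesses at NUMA or transparent_hugepage as the cause and does not show the parameters that were changed. As a rough sketch for anyone comparing their own machine, the snippet below reads the current transparent hugepage mode and the kernel command line from the standard Linux locations. The helper names are hypothetical, and nothing here reproduces the poster's actual settings.

```python
from pathlib import Path

# Standard Linux locations; they may be absent on other systems or in containers.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
CMDLINE_PATH = Path("/proc/cmdline")


def active_thp_mode(text):
    """Return the bracketed (active) value, e.g. 'madvise' from 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_text_or_none(path):
    """Read a small text file, returning None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def read_thp_mode(path=THP_PATH):
    """Return the active THP mode, or None if the file is missing or unparsable."""
    text = read_text_or_none(path)
    return active_thp_mode(text) if text is not None else None


if __name__ == "__main__":
    print("THP mode:", read_thp_mode())
    print("Kernel command line:", read_text_or_none(CMDLINE_PATH))
```

Comparing this output before and after a boot parameter change shows whether the change actually took effect.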