pgroonga / pgroonga

PGroonga is a PostgreSQL extension to use Groonga as an index. PGroonga makes PostgreSQL a fast full-text search platform for all languages!
https://pgroonga.github.io/

Regression: PGroonga crashes PostgreSQL on CPU without AVX2 #578

Open khalilsadiq786 opened 1 week ago

khalilsadiq786 commented 1 week ago

What happened?

With PGroonga installed from the PPA on Ubuntu 22.04, as of 3.2.0 it worked correctly even on very old systems without AVX of any kind. However, sometime between 3.2.0 and 3.2.4 this appears to have changed. On 3.2.4, any queries which attempt to use PGroonga at all will crash PostgreSQL with an "Illegal instruction" error if the CPU does not support AVX2.

I do not see this documented anywhere, so I believe it must be unintentional. Probably a compiler flag was accidentally changed somewhere in the build process.

Working on 3.2.0, not working on 3.2.4:

Working on 3.2.4:

How to reproduce it

Attempt to use PGroonga on a CPU without AVX2.
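
To confirm whether a given machine is one of the affected CPUs, one option is to inspect the feature flags the Linux kernel reports. The following is a minimal sketch, assuming Linux on x86 with a readable `/proc/cpuinfo`; the function names `cpu_flags` and `has_avx2` are illustrative and are not part of PGroonga or Groonga.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text.

    Only the first "flags" line is used; on typical x86 systems every
    logical CPU reports the same set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx2(cpuinfo_text):
    """True if the 'avx2' flag is present in the given cpuinfo text."""
    return "avx2" in cpu_flags(cpuinfo_text)


def local_cpu_has_avx2(path="/proc/cpuinfo"):
    """Check the current machine (assumes a Linux-style /proc filesystem)."""
    with open(path) as f:
        return has_avx2(f.read())
```

Calling `local_cpu_has_avx2()` on the database host should return `False` on the kind of CPU described in this report.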

Expected behavior

PGroonga should work on these CPUs as it used to.

Environment

Additional context

No response

komainu8 commented 1 week ago

Could you provide PGroonga's log and PostgreSQL's log when PGroonga crashed?

By default, PGroonga's log is at /var/lib/postgresql/14/main/pgroonga.log and PostgreSQL's log is in /var/log/postgresql.

khalilsadiq786 commented 1 week ago

Here's an example -- any query that touches PGroonga at all will trigger this (including a SELECT * FROM table query on a table which has a PGroonga index on it).

PostgreSQL log:

2024-11-14 07:50:17.691 UTC [29764] LOG:  starting PostgreSQL 14.13 (Ubuntu 14.13-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-11-14 07:50:17.692 UTC [29764] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-11-14 07:50:17.692 UTC [29764] LOG:  listening on IPv6 address "::", port 5432
2024-11-14 07:50:17.704 UTC [29764] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-11-14 07:50:17.746 UTC [29765] LOG:  database system was shut down at 2024-11-14 07:50:11 UTC
2024-11-14 07:50:17.810 UTC [29764] LOG:  database system is ready to accept connections
2024-11-14 07:50:34.856 UTC [29764] LOG:  server process (PID 29786) was terminated by signal 4: Illegal instruction
2024-11-14 07:50:34.856 UTC [29764] DETAIL:  Failed process was running: CREATE EXTENSION pgroonga;
2024-11-14 07:50:34.857 UTC [29764] LOG:  terminating any other active server processes
2024-11-14 07:50:34.866 UTC [29788] postgres@postgres FATAL:  the database system is in recovery mode
2024-11-14 07:50:34.878 UTC [29764] LOG:  all server processes terminated; reinitializing
2024-11-14 07:50:35.105 UTC [29789] LOG:  database system was interrupted; last known up at 2024-11-14 07:50:17 UTC
2024-11-14 07:50:36.004 UTC [29789] LOG:  database system was not properly shut down; automatic recovery in progress
2024-11-14 07:50:36.040 UTC [29789] LOG:  redo starts at 1/1252138
2024-11-14 07:50:36.046 UTC [29789] LOG:  unexpected pageaddr 0/E626C000 in log segment 000000010000000100000001, offset 2539520
2024-11-14 07:50:36.047 UTC [29789] LOG:  redo done at 1/126A0E8 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2024-11-14 07:50:36.396 UTC [29764] LOG:  database system is ready to accept connections

The PGroonga log is empty -- I can only assume it crashes before it would be able to write anything.
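
For anyone triaging similar reports, the crash entries can be pulled out of a PostgreSQL log like the one above mechanically. This is a rough sketch that assumes the message wording shown in this log ("server process (PID ...) was terminated by signal ..." followed by a "Failed process was running:" DETAIL line); `find_crashes` is a hypothetical helper, not a PostgreSQL or PGroonga tool.

```python
import re

# Patterns are based only on the message wording seen in the log above.
TERMINATED = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+): (.*)$"
)
RUNNING = re.compile(r"DETAIL:\s+Failed process was running: (.*)$")


def find_crashes(log_text):
    """Return one dict per crashed backend, with the query it was running."""
    crashes = []
    for line in log_text.splitlines():
        m = TERMINATED.search(line)
        if m:
            crashes.append({
                "pid": int(m.group(1)),
                "signal": int(m.group(2)),
                "name": m.group(3).strip(),
                "query": None,
            })
            continue
        m = RUNNING.search(line)
        # Attach the DETAIL line to the most recent crash that lacks a query.
        if m and crashes and crashes[-1]["query"] is None:
            crashes[-1]["query"] = m.group(1).strip()
    return crashes
```

Applied to the log above, this would report one crash with signal 4 (Illegal instruction) while running `CREATE EXTENSION pgroonga;`.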

kou commented 1 week ago

Thanks. Do you know which Groonga version you used with PGroonga 3.2.0? I think the SIMD-related code is in Groonga, not PGroonga.

kou commented 1 week ago

simdjson support or llama.cpp support may be related.

khalilsadiq786 commented 1 week ago

Based on system logs, the version of the libgroonga0 package shipped in the PPA as of when this was working correctly was 14.0.3-1.ubuntu22.04.1. That version was installed on 2024-05-26, and then PostgreSQL was restarted on 2024-06-13.

Various libgroonga0 package updates were installed after that, but because PostgreSQL was not restarted after those until a few days ago, I do not believe they would have ever been loaded.
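
The reasoning above (an upgraded package is not picked up until the server restarts) can be checked on Linux by looking at a running backend's memory map. When a mapped library file has been replaced on disk, `/proc/<pid>/maps` typically shows its path with a trailing ` (deleted)`. The sketch below assumes that behaviour and that the library path contains `libgroonga`; `loaded_libraries` is a hypothetical helper, and reading another user's `/proc/<pid>/maps` generally requires running as that user or as root.

```python
def loaded_libraries(maps_text, name_fragment="libgroonga"):
    """Map each matching library path to True if its file was replaced on disk.

    `maps_text` is the content of /proc/<pid>/maps for a running process.
    """
    result = {}
    marker = " (deleted)"
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        stale = path.endswith(marker)
        if stale:
            path = path[: -len(marker)]
        if name_fragment in path:
            result[path] = result.get(path, False) or stale
    return result
```

A `True` value for a libgroonga path would suggest the process is still running an older copy of the library than the one currently installed.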

khalilsadiq786 commented 1 week ago

I have just tested further and can confirm the regression is actually in libgroonga 14.1.0 -- with the .deb file from 14.0.9 from the GitHub Actions history it appears to work correctly.

I will file an issue against the groonga repository.