supabase / pg_net

A PostgreSQL extension that enables asynchronous (non-blocking) HTTP/HTTPS requests with SQL
https://supabase.github.io/pg_net
Apache License 2.0

UNLOGGED table segfault causing whole cluster to restart #132

Closed. steve-chavez closed this issue 2 months ago

steve-chavez commented 2 months ago

Problem

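In some rare cases, the PostgreSQL log shows the pg_net worker dying with a segmentation fault, after which every server process is restarted: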

"background worker ""pg_net 0.8.0 worker"" (PID 132605) was terminated by signal 11: Segmentation fault",,,,,,,,,"","postmaster",,0
"terminating any other active server processes",,,,,,,,,"","postmaster",,0
"all server processes terminated; reinitializing",,,,,,,,,"","postmaster",,0
"database system was interrupted; last known up at 2024-06-03 02:15:10 UTC",,,,,,,,,"","startup",,0
"database system was not properly shut down; automatic recovery in progress",,,,,,,,,"","startup",,0
"redo starts at 0/6154CA0",,,,,,,,,"","startup",,0

A segmentation fault in pg_net should only kill its own background worker, but here it somehow restarts the whole cluster.

This thread https://www.postgresql.org/message-id/flat/871rucdwkt.fsf%40alfa.kjonca indicates that this could be an UNLOGGED table bug.

So far I haven't been able to reproduce this. Some Supabase tickets indicate that it can sometimes happen when invoking edge functions.

Solution

https://github.com/supabase/pg_net/issues/62 is a possible solution, since we would no longer need the UNLOGGED table for the queue.

steve-chavez commented 2 months ago

It turns out that a segfault in a background worker can crash the whole cluster (the postmaster terminates all other server processes and reinitializes, as in the log above); see https://www.postgresql.org/message-id/flat/16233-d3f4c2bf4d76018e%40postgresql.org.

I'll close this by explicitly handling all cases where a NULL might be obtained in the worker code.
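
As a rough illustration of what "explicitly handling NULLs" could look like, here is a minimal standalone sketch. It is not pg_net's actual code: the `QueuedRequest` struct, the `ReqStatus` enum, and both functions are hypothetical names made up for this example. The idea is simply to validate every pointer before dereferencing it and to return a status the worker can log and skip on, instead of letting a NULL reach a dereference and segfault.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one row read from the request queue.
 * Any field may be NULL if the underlying column was NULL. */
typedef struct {
    const char *method;
    const char *url;
    const char *body;
} QueuedRequest;

/* Hypothetical status codes; the caller would log and skip on non-OK. */
typedef enum {
    REQ_OK = 0,
    REQ_SKIP_NULL_ROW,
    REQ_SKIP_NULL_URL,
    REQ_SKIP_NULL_METHOD
} ReqStatus;

/* Check every pointer that is required before it is ever dereferenced. */
ReqStatus validate_request(const QueuedRequest *req)
{
    if (req == NULL)
        return REQ_SKIP_NULL_ROW;
    if (req->url == NULL)
        return REQ_SKIP_NULL_URL;
    if (req->method == NULL)
        return REQ_SKIP_NULL_METHOD;
    return REQ_OK;
}

/* Optional fields get a safe default: a NULL body counts as empty. */
size_t body_length(const QueuedRequest *req)
{
    if (req == NULL || req->body == NULL)
        return 0;
    return strlen(req->body);
}
```

In a real worker, the non-OK branches would presumably report the problem through the normal error path and move on to the next request, so a bad row costs one request rather than the whole cluster.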