supabase / pg_net

A PostgreSQL extension that enables asynchronous (non-blocking) HTTP/HTTPS requests with SQL
https://supabase.github.io/pg_net
Apache License 2.0
213 stars 16 forks source link

Self Hosted / Docker `supabase/postgres:15.1.0.109_arm64` : Worker does not execute requests #98

Closed bentron2000 closed 1 year ago

bentron2000 commented 1 year ago

Describe the bug

Docker install of supabase/postgres:15.1.0.109_arm64 pg_net installs successfully, creates net schema etc, and functions seemingly operate as expected, however requests do not execute. They do not time out either.

To Reproduce

  1. install postgres docker image supabase/postgres:15.1.0.109_arm64
  2. create a database
  3. execute create extension pg_net;
  4. Send an example query as per readme:
    SELECT net.http_get (
    'https://postman-echo.com/get?foo1=bar1&foo2=bar2'
    ) AS request_id;
  5. query: SELECT * FROM net.http_request_queue => responds with one row corresponding to the request
  6. query: SELECT * FROM net._http_response => responds with no rows
  7. SELECT net.check_worker_is_up() reports: ERROR: the pg_net background worker is not up
  8. manually querying SELECT * from pg_stat_activity shows a worker is running - but the backend_type from this query reports "pg_net 0.7.1 worker" - where the query in the check_worker_is_up() function expects "pg_net worker"

At this point the request stays in net.http_request_queue and never times out etc...

NB. A test hosted on supabase sends requests successfully and populates net._http_response with a response, but also shows ERROR: the pg_net background worker is not up when check_worker_is_up() is executed.

Expected behavior

The request should execute, the row in the request table should no longer be present and a response should be in the response table.

Screenshots

Screenshot 2023-08-07 at 12 41 09 pm Screenshot 2023-08-07 at 12 40 40 pm Screenshot 2023-08-07 at 12 38 31 pm Screenshot 2023-08-07 at 12 38 00 pm

System information

TheOtherBrian1 commented 1 year ago

I looked into issue regarding the inaccurate error messages.

In September of last year, a commit was made to worker.c that incorporated the version number into the return message, i.e. the message went from "pg_net worker" to "pg_net VERSION worker". The net.check_worker_is_up function was not updated to reflect this change.

This issue can be resolved in 3 ways:

  1. Remove the version number from the worker.c return message
  2. Hard code the return number into the net.check_worker_is_up function
  3. Change the comparison in the net.check_worker_is_up function to be general purpose for all new versions.

I prefer the third option because it would prevent this type of issue in the future. I took the liberty of rewriting the offending line:

Old Comparison:

... where backend_type = 'pg_net worker') ...

New Comparison:

... where backend_type ~ 'pg_net \d+(\.\d+){0,2} worker')

As for why your requests are idling, I'm not sure. That's a more complex issue that is not something I can resolve, at least not quickly.

I suspect your specific issue may have to do with your Docker configuration and its permission to access the internet. It might be helpful to monitor network traffic in your container using a terminal tool, such as nload, to see if the extension is making any requests.

bentron2000 commented 1 year ago

Thanks @TheOtherBrian1

I should have thought the timeout would still function in the case that network conditions/config were the issue here. At the least, some sort of error messaging 🤔

I have configured the docker container to have internet access (as far as I can tell - i.e. I can curl things from a bash command line within the docker container)

Unsure where to turn now :/

Thanks again

TheOtherBrian1 commented 1 year ago

@bentron2000

I spent some time looking through the extensions code, and it seems that many of the errors are directly written to the PostgreSQL log file. On your set up, it may be located in the var/log/postgresql directory. You should try viewing it if you haven't. If does show the cause of the error, I would be very curious to know what it is.

bentron2000 commented 1 year ago

@TheOtherBrian1

Thanks for looking into this. You've pointed me in a few directions that really helped me today.

It appears as though the devil is in the detail here: In my repro instructions I put 'create a database' - this is the key issue.

pg_net does not work (or is not configured to work) on any database other than the postgres database that is present by default in this docker image.

When I install the extension and add a request in the existing postgres database - I have no issues with pg_net - (other than the previously noted bug in check_worker_is_up())

Curiously though - there are no logs where you indicated.

Maintainers: The documentation should either:

  1. Note that pg_net only works with the specific config present in the existing 'postgres' database (and why), or,
  2. Give instructions on how to configure it for an arbitrary database.
TheOtherBrian1 commented 1 year ago

@bentron2000

Wow, you are right, it is by default setup only to work on the postgres database, as coded in Line 41 in worker.c

char* database_name = "postgres";

Unfortunately, there were no hints in the documentation alluding to extension configuration. The documentation most certainly should be updated.

In the _PG_Init function in worker.c, there is the following code:

DefineCustomStringVariable(
      "pg_net.database_name", 
       "database where the pg_net worker is connected",
        NULL,
        &database_name,
        "postgres",
        PGC_SIGHUP, 0,
        NULL, NULL, NULL);
}

DefineCustomStringVariable comes from the GUC.c library in PostgreSQL, which stands for Grand Unified Configuration. This code makes it possible to set the _databasename variable in the postgres.conf file.

Looking further into the code, it seems that the extension creates 3 configurable variables:

  1. pg_net.batch_size (default: 200): An integer that limits the max number of rows that the extension will process from _net.http_requestqueue during each read
  2. pg_net.ttl (default: 6 hours): An interval that defines the max time a row in the _net._httpresponse will live before being deleted
  3. pg_net.database_name (default: 'postgres'): A string that defines which database the extension is applied to

All these variables can be viewed with the following command:

select * from pg_settings WHERE name LIKE 'pg_net%'

The postgres.conf, file can be found with the following SQL command:

SHOW config_file;

I'm not too familiar with modifying the postgres.conf, so I Google searched for some hints on how to handle it properly.

I found a supabase post called Protecting reserved roles with PostgreSQL Hooks that included a section about modifying configuration variables for a "supautils.c" extension:

postgresql.conf

# set the extension as preloaded, this will require a restart
shared_preload_libraries="supautils"

# the reserved roles
supautils.reserved_roles="supabase_storage_admin, supabase_auth_admin"

Extrapolating from that example, my guess is that you could modify your postgresql.conf file with the following line(s) to get _pgnet to work:

#The below line may be necessary
#shared_preload_libraries="pg_net" 

pg_net.database_name=<"Your_new_database">

You may also be able to change the variable in SQL by using an ALTER SYSTEM query, as outlined by this stackoverflow post

After saving the file, you can execute select pg_reload_conf() to update postgres.conf for your database. If the extension does not respond to the update, it may be necessary to restart your entire server.

If this solves your issue, I'll add some configuration guidelines to the readme.md for future readers.

bentron2000 commented 1 year ago

@TheOtherBrian1

Amazing! That certainly looks like the information required to get this working.

I made a quick attempt to use ALTER SYSTEM to modify the config but the postgres user in the image does not have sufficient permissions for this purpose.

However, adding pg_net.database_name=<"Your_new_database"> to the postgresql.conf file does work - and pg_net operates correctly once this is set.

This is the solution :) Really appreciate the time you've spent assisting me.

To summarize: When using pg_net with a database named other than postgres - the variable pg_net.database_name must be set to the name of the database the worker should attach to. This may be able to be done at runtime - however, is best achieved via postgresql.conf. NB. This suggests that only one instance of pg_net can be enabled per postgres instance.

Thanks again :) B

soedirgo commented 1 year ago

Thanks all, and sorry for the confusion! I'm updating the README here to clarify on this.