zalando / spilo

Highly available elephant herd: HA PostgreSQL cluster using Docker
Apache License 2.0
1.51k stars 371 forks source link

pg_cron fails with returned message `connection failed` #959

Closed ETisREAL closed 4 months ago

ETisREAL commented 6 months ago

Hi :) hope everything's alright.

I am trying to set up a cron using the pg_cron extension. I have checked an the image has set the pg_hba trust for local connections:

pg_hba:
  - local   all             all                                   trust

and it is using a unix socket for connecitons:

use_unix_socket: true

I am creating a function that the pg_cron should run, which I'll share for reference as well:

CREATE OR REPLACE FUNCTION manage_autoscaling_metrics() RETURNS VOID AS $$
    DECLARE
        conn_in_use FLOAT;
    BEGIN
        SELECT sum(numbackends) FROM pg_stat_database INTO conn_in_use;

        INSERT INTO autoscaling_metrics (connection_in_use) VALUES (conn_in_use);

        DELETE FROM autoscaling_metrics WHERE timestamp < NOW() - INTERVAL '5 minutes';
    END;
    $$ LANGUAGE plpgsql;

and setting up the cron job in the postgresql DB


SELECT cron.schedule('*/5 * * * *', 'SELECT manage_autoscaling_metrics();');

The function's code is correct, because if I launch the same query manually ('SELECT manage_autoscaling_metrics();') a row gets inserted as expected. However when I check for the status of the cron I get this output:

select * from cron.job_run_details order by start_time desc limit 5;
 jobid | runid | job_pid | database | username |               command                | status |  return_message   | start_time | end_time 
-------+-------+---------+----------+----------+--------------------------------------+--------+-------------------+------------+----------
     1 |     1 |         | postgres | zalando  | SELECT manage_autoscaling_metrics(); | failed | connection failed |            | 

I really have no idea what I'm doing wrong. As I said, I checked the configurations of the container and everything seems fine. Is it a database issue perhaps? But from what I've read the cron is supposed to run in the postgres DB isn't it?

If you need any more context I'll gladly provide it.

Thanks in advance

hughcapet commented 4 months ago

wondering why you are asking questions about pg_cron usage in the Spilo repo:)

maybe this can help in your situation https://github.com/citusdata/pg_cron/issues/48#issuecomment-386207327

ETisREAL commented 4 months ago

@hughcapet I am asking here because I believe there is a misconfiguration in the Spilo image, since I checked the pgconf in the container and it should allow trusted local connections, yet pg_cron somehow fails.