[Bug]: Unable to change job owner

jflambert commented 4 months ago

What type of bug is this?

Unexpected error

What subsystems and features are affected?

User-Defined Action (UDA)

What happened?

I need to be able to change ownership of timescaledb background jobs.

Normally I use REASSIGN OWNED BY X TO Y; to reassign all objects from role X to role Y, and then drop role X.

TimescaleDB jobs aren't handled by this operation, so I had to run something like UPDATE _timescaledb_config.bgw_job SET owner='Y'; for DROP ROLE to work without complaining.

However, the scheduler still seems to execute the jobs as the old owner.

If I run a job manually with select run_job(1000); I don't have an issue. So most likely the scheduler is caching the OID that was used at creation time?
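To compare what the catalog says with what the scheduler does, something like the following may help. It is only a sketch: it assumes the `timescaledb_information.jobs` and `timescaledb_information.job_stats` views expose the columns named here, and it shows catalog state only, not whatever the scheduler has cached.

```sql
-- Owner currently recorded for each job in the catalog.
SELECT job_id, proc_name, owner
FROM timescaledb_information.jobs
ORDER BY job_id;

-- Outcome of scheduled runs; failures that keep growing after the owner
-- was updated would point at the scheduler rather than the catalog.
SELECT job_id, last_run_status, total_failures
FROM timescaledb_information.job_stats
ORDER BY job_id;
```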

TimescaleDB version affected

2.14.2

PostgreSQL version used

16.2

What operating system did you use?

timescaledb-ha

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

- create database D and owner D
- create role X superuser
- (as X) select add_job('xxx', '5s')
- reassign owned by X to D
- UPDATE _timescaledb_config.bgw_job SET owner='D';
- DROP USER X

and you should get errors in the logs every 5 seconds
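The steps above can be sketched as a psql script. This is an approximation rather than the exact commands I ran: the body of procedure xxx is a hypothetical placeholder, and SET ROLE stands in for logging in as X.

```sql
-- Run as a superuser in psql.
CREATE ROLE d LOGIN;
CREATE DATABASE d OWNER d;
\c d
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE ROLE x SUPERUSER;
SET ROLE x;

-- Hypothetical job body; job procedures take (job_id, config).
CREATE PROCEDURE xxx(job_id int, config jsonb)
LANGUAGE plpgsql AS $$
BEGIN
  RAISE NOTICE 'job % ran', job_id;
END
$$;

SELECT add_job('xxx', '5s');
RESET ROLE;

REASSIGN OWNED BY x TO d;
-- As in the steps above, this rewrites the owner of every job row.
UPDATE _timescaledb_config.bgw_job SET owner = 'd';
DROP USER x;
```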

mkindahl commented 4 months ago

@jflambert Thank you for the bug report.

I see two problems here:

1. The cache is not updated.
2. Support for handling role renames was added in https://github.com/timescale/timescaledb/pull/5558 but not support for REASSIGN, so this should be fixed.

jflambert commented 4 months ago

Do you have more info about 1 for me? I have another related problem that I'd like to log, but I want better reproduction steps first.

A similar issue arises when I temporarily alter user X with SUPERUSER, create some jobs, and then alter user X with NOSUPERUSER. This will bypass the REASSIGN, but now the problem I have is that the jobs don't execute (via scheduler) until I restart the postgres process.

EDIT: Ah, I think I know what I did wrong. Before applying migrations, I would kill all backend processes with this:

  PERFORM pg_terminate_backend(pid)
  FROM pg_stat_activity
  WHERE datname = 'xxx'
  AND pid <> pg_backend_pid();

Unfortunately this also includes the timescaledb background worker. So I added AND wait_event_type = 'Client' to allow it to live during migrations.
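For reference, the adjusted statement looks roughly like this when wrapped in a DO block (PERFORM is only valid inside PL/pgSQL). The database name 'xxx' is the placeholder from above. Note that the wait_event_type filter only matches sessions currently waiting on the client, so a session that is busy executing a query would also be left alone.

```sql
DO $$
BEGIN
  -- Terminate other sessions on the database, but only those waiting on
  -- a client, which leaves the TimescaleDB background worker running.
  PERFORM pg_terminate_backend(pid)
  FROM pg_stat_activity
  WHERE datname = 'xxx'
    AND pid <> pg_backend_pid()
    AND wait_event_type = 'Client';
END
$$;
```

An alternative I have not tested here would be to filter on backend_type = 'client backend' instead, which selects ordinary client sessions regardless of what they are waiting on.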

mkindahl commented 4 months ago

@jflambert #6987 should fix issue 2.
