timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.78k stars 886 forks source link

[Bug]: pg_dump generates truncate only for hypertables which fails during pg_restore #7137

Open toutas opened 3 months ago

toutas commented 3 months ago

What type of bug is this?

Unexpected error

What subsystems and features are affected?

Backup, Partitioning, Restore

What happened?

When I pg_dump my selfhosted database that contains many tables along with one single hypertable which was generated using

SELECT create_hypertable('satelliteproductssatellitezones', 'image_date', chunk_time_interval => INTERVAL '3 months', migrate_data => true);

with the following pg_dump command

pg_dump -Fc -Z zstd:9 -h localhost -U postgres {{ database_name }} > {{ dump_path }}/{{ backup_dump_name }}'

the generated dump throws errors during restore with pg_restore regarding the ONLY keyword. Here is an excerpt:

pg_restore -j 4 --clean --create --if-exists --format=c -U postgres -d postgres {{ container_volume_path }}/{{ dump_file }}
pg_restore: error: could not execute query: ERROR:  cannot truncate only a hypertable
HINT:  Do not specify the ONLY keyword, or use truncate only on the chunks directly.
Command was: TRUNCATE TABLE ONLY public.satelliteproductssatellitezones;

pg_restore: error: could not execute query: ERROR:  current transaction is aborted, commands ignored until end of transaction block
Command was: COPY public.satelliteproductssatellitezones (id, satelliteproduct_id, satellitezone_id, predictiontasks_processstatus_id, trainingdatatasks_processstatus_id, image_date) FROM stdin;
pg_restore: error: could not execute query: ERROR:  ONLY option not supported on hypertable operations
Command was: ALTER TABLE ONLY public.satelliteproductssatellitezones
    ADD CONSTRAINT satelliteproductssatellitezones_pkey PRIMARY KEY (id, image_date);

I have been looking around for documentation to determine if I am doing something wrong, or if this way of performing backup/restore is just not supported? The backup and restore docs mention using pg_dump and pg_restore, so I am wondering why this is not working or does not have any documentation to point out backups should not be done this way with timescaledb.

At this point I assume the best practice would be to generate directory dump and use sed to remove all occurences of only, but I hope if that is the recommended approach it will be documented somewhere as it seems like a bug that pg_dump/pg_restore is recommended but not quite working.

TimescaleDB version affected

2.15.3

PostgreSQL version used

PostgreSQL 16.3 (Ubuntu 16.3-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

What operating system did you use?

Ubuntu 20.04.6 LTS (GNU/Linux 5.4.0-189-generic x86_64)

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted, Other

Relevant log output and stack trace

pg_restore: error: could not execute query: ERROR:  cannot truncate only a hypertable
HINT:  Do not specify the ONLY keyword, or use truncate only on the chunks directly.
Command was: TRUNCATE TABLE ONLY public.satelliteproductssatellitezones;

pg_restore: error: could not execute query: ERROR:  current transaction is aborted, commands ignored until end of transaction block
Command was: COPY public.satelliteproductssatellitezones (id, satelliteproduct_id, satellitezone_id, predictiontasks_processstatus_id, trainingdatatasks_processstatus_id, image_date) FROM stdin;
pg_restore: error: could not execute query: ERROR:  ONLY option not supported on hypertable operations
Command was: ALTER TABLE ONLY public.satelliteproductssatellitezones
    ADD CONSTRAINT satelliteproductssatellitezones_pkey PRIMARY KEY (id, image_date);

pg_restore: error: could not execute query: ERROR:  duplicate key value violates unique constraint "chunk_index_chunk_id_index_name_key"
DETAIL:  Key (chunk_id, index_name)=(10, _hyper_1_10_chunk_satelliteproductssatellitezones_sz_id_sp_id_i) already exists.
Command was: CREATE UNIQUE INDEX satelliteproductssatellitezones_sz_id_sp_id_image_date_ak ON public.satelliteproductssatellitezones USING btree (satellitezone_id, satelliteproduct_id, image_date);

pg_restore: error: could not execute query: ERROR:  duplicate key value violates unique constraint "chunk_index_chunk_id_index_name_key"
DETAIL:  Key (chunk_id, index_name)=(10, _hyper_1_10_chunk_spsz_pt_unassigned_idx) already exists.
Command was: CREATE INDEX spsz_pt_unassigned_idx ON public.satelliteproductssatellitezones USING btree (satelliteproduct_id) WHERE (predictiontasks_processstatus_id = 1);

pg_restore: error: could not execute query: ERROR:  duplicate key value violates unique constraint "chunk_index_chunk_id_index_name_key"
DETAIL:  Key (chunk_id, index_name)=(10, _hyper_1_10_chunk_spsz_satelliteproduct_id_idx) already exists.
Command was: CREATE INDEX spsz_satelliteproduct_id_idx ON public.satelliteproductssatellitezones USING btree (satelliteproduct_id);

pg_restore: error: could not execute query: ERROR:  duplicate key value violates unique constraint "chunk_index_chunk_id_index_name_key"
DETAIL:  Key (chunk_id, index_name)=(10, _hyper_1_10_chunk_spsz_tdt_unassigned_idx) already exists.
Command was: CREATE INDEX spsz_tdt_unassigned_idx ON public.satelliteproductssatellitezones USING btree (satelliteproduct_id) WHERE (trainingdatatasks_processstatus_id = 1);

pg_restore: error: could not execute query: ERROR:  ONLY option not supported on hypertable operations
Command was: ALTER TABLE ONLY public.satelliteproductssatellitezones
    ADD CONSTRAINT fk_satelliteproductssatellitezones_satellitezones FOREIGN KEY (satellitezone_id) REFERENCES public.satellitezones(id);

pg_restore: error: COPY failed for table "_hyper_1_1_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_1_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_1_chunk, line 1: "764910124 37300259    754043  6   6   2021-07-11 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_2_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_2_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_2_chunk, line 1: "765069017 37309004    728808  6   6   2024-06-06 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_3_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_3_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_3_chunk, line 1: "803252402 39414181    751270  6   6   2022-12-03 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_4_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_4_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_4_chunk, line 1: "772474200 37695143    786935  6   6   2021-09-30 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_5_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_5_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_5_chunk, line 1: "777159202 37967693    764891  6   6   2021-12-29 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_6_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_6_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_6_chunk, line 1: "782769870 38300710    800159  6   6   2022-03-29 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_7_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_7_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_7_chunk, line 1: "790918251 38733120    884539  6   6   2022-06-27 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_8_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_8_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_8_chunk, line 1: "804128042 39467932    648997  6   6   2022-12-24 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_9_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_9_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_9_chunk, line 1: "807402378 39663087    775768  6   6   2023-04-01 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_10_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_10_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_10_chunk, line 1: "814428871    40013632    649070  6   6   2023-06-22 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_11_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_11_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_11_chunk, line 1: "822184020    40405139    648889  6   6   2023-09-20 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_12_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_12_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_12_chunk, line 1: "408929331    17816730    636918  9   6   2023-12-28 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_13_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_13_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_13_chunk, line 1: "442304389    20154460    703557  9   9   2018-11-16 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_14_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_14_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_14_chunk, line 1: "834790322    41121508    673109  6   6   2019-02-23 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_16_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_16_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_16_chunk, line 1: "834965663    41131615    674425  6   6   2020-12-24 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_15_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_15_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_15_chunk, line 1: "834856720    41125418    751642  6   6   2020-04-26 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_17_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_17_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_17_chunk, line 1: "461625897    21188149    698261  6   9   2019-05-08 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_18_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_18_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_18_chunk, line 1: "476398082    21866066    698261  6   9   2019-08-12 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_19_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_19_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_19_chunk, line 1: "520108156    24097956    698261  6   9   2020-07-13 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_20_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_20_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_20_chunk, line 1: "561640684    26247463    698261  6   9   2021-06-02 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_21_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_21_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_21_chunk, line 1: "490091918    22573414    700919  6   9   2019-11-28 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_22_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_22_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_22_chunk, line 1: "504253309    23363236    700919  6   9   2020-04-03 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_23_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_23_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_23_chunk, line 1: "551122095    25740410    700919  6   9   2021-03-24 00:00:00"
pg_restore: error: COPY failed for table "_hyper_1_24_chunk": ERROR:  invalid INSERT on the root table of hypertable "_hyper_1_24_chunk"
HINT:  Make sure the TimescaleDB extension has been preloaded.
CONTEXT:  COPY _hyper_1_24_chunk, line 1: "839542819    41298728    767516  6   6   2024-06-16 00:00:00"
pg_restore: error: could not execute query: ERROR:  ONLY option not supported on hypertable operations
Command was: ALTER TABLE ONLY public.satelliteproductssatellitezones
    ADD CONSTRAINT fk_satelliteproductssatellitezones_satelliteproducts FOREIGN KEY (satelliteproduct_id) REFERENCES public.satelliteproducts(id);

How can we reproduce the bug?

I have not attempted to reproduce it from scratch in an empty database, but I can reproduce the bug from my existing database not containing any hypertables. I am using the docker image timescale/timescaledb-ha:pg16.3-ts2.15.3.

  1. converting an existing table to a hypertable using the command

    SELECT create_hypertable('tablename', 'timestampcolumn', chunk_time_interval => INTERVAL '3 months', migrate_data => true);
  2. dump the complete database containing the hypertable with the following pg_dump flags

    docker exec postgrescontainer bash -c 'pg_dump -Fc -Z zstd:9 -h localhost -U postgres databasename > /var/lib/postgresql/data/backup.dump'
  3. on a new fresh postgres container with the backup.dump file present in the data directory run

    docker exec freshpostgrescontainer  bash -c 'pg_restore -j 4 --clean --create --if-exists --format=c -U postgres -d postgres /var/lib/postgresql/data/backup.dump'

the errors are then generated

fabriziomello commented 3 months ago

@toutas when restoring from a dump you should put the instance in the restoring mode. Have a look the documentation: https://docs.timescale.com/migrate/latest/pg-dump-and-restore/

toutas commented 3 months ago

@toutas when restoring from a dump you should put the instance in the restoring mode. Have a look the documentation: https://docs.timescale.com/migrate/latest/pg-dump-and-restore/

I did end up doing that, but it makes no difference.

I fixed it by running pg_restore -f - to pipe the output to sed and remove all ONLY occurences for hypertables sed "s/ONLY {{ hypertable }}/{{ hypertable }}/g"

still think it should be documented that this is necessary

fabriziomello commented 3 months ago

@toutas when restoring from a dump you should put the instance in the restoring mode. Have a look the documentation: https://docs.timescale.com/migrate/latest/pg-dump-and-restore/

I did end up doing that, but it makes no difference.

I fixed it by running pg_restore -f - to pipe the output to sed and remove all ONLY occurences for hypertables sed "s/ONLY {{ hypertable }}/{{ hypertable }}/g"

still think it should be documented that this is necessary

The safe and correct way (as the document link I've sent before) is do something like:

  1. Put your database into the restoring mode:
    psql -d your_database -c 'SELECT timescaledb_pre_restore();'
  2. Run the restore:
    pg_restore ...
  3. Put back your database into the normal mode:
    psql -d your_database -c 'SELECT timescaledb_post_restore();'
fabriziomello commented 3 months ago

@toutas you need to pay attention to run the "pre" and "post" restore commands into the same database you'll restore your database.

toutas commented 3 months ago

@toutas when restoring from a dump you should put the instance in the restoring mode. Have a look the documentation: https://docs.timescale.com/migrate/latest/pg-dump-and-restore/

I did end up doing that, but it makes no difference. I fixed it by running pg_restore -f - to pipe the output to sed and remove all ONLY occurences for hypertables sed "s/ONLY {{ hypertable }}/{{ hypertable }}/g" still think it should be documented that this is necessary

The safe and correct way (as the document link I've sent before) is do something like:

1. Put your database into the restoring mode:
psql -d your_database -c 'SELECT timescaledb_pre_restore();'
2. Run the restore:
pg_restore ...
3. Put back your database into the normal mode:
psql -d your_database -c 'SELECT timescaledb_post_restore();'

I am well aware of the safe and correct way as I mentioned, I am already doing it. It does not fix the issue at hand though, so not sure why you are pointing it out?

image

This is the command I am running when restoring. without sed to remove ONLY keyword from DDL touching hypertables it does not work. If I remove that part it fails restoring with the errors mentioned in my main issue post.

@fabriziomello

so what you are saying is I need to run it not on the initial postgres database, but create my target database before restoring?

fabriziomello commented 3 months ago

so what you are saying is I need to run it not on the initial postgres database, but create my target database before restoring?

Exactly... because this "timescaledb_pre_restore()" will disable timescaledb hooks in order to don't raise exceptions on the statements that are failing during the restore. In a normal operation those statements are blocked by timescaledb.

toutas commented 3 months ago

so what you are saying is I need to run it not on the initial postgres database, but create my target database before restoring?

Exactly... because this "timescaledb_pre_restore()" will disable timescaledb hooks in order to don't raise exceptions on the statements that are failing during the restore. In a normal operation those statements are blocked by timescaledb.

Fair, I will try it out and confirm it fixes the issue.

Normally pg_dump and pg_restore do not require the target database to be present, so would be nice with a notice in the documentation to people using the --create flag like I am for restoring to fresh instances.

fabriziomello commented 3 months ago

Normally pg_dump and pg_restore do not require the target database to be present, so would be nice with a notice in the documentation to people using the --create flag like I am for restoring to fresh instances.

I know but unfortunately the pg_dump options --create and --clean will not work properly with timescaledb... this is another advice for you to don't use it and manually create the new target database.