timescale / timescaledb-extras

Helper functions and procedures for timescale

migrate_to_hypertable() silently hangs #29

Open chris-morris-h2o opened 1 year ago

chris-morris-h2o commented 1 year ago

Timescale Version: 2.7.2
Postgres Version: 14.5

Issue:

I have a regular table that is ~170 gigs in size. I followed the directions in migrate.md:

CREATE TABLE sqlth_1_data_new (LIKE sqlth_1_data INCLUDING DEFAULTS INCLUDING CONSTRAINTS EXCLUDING INDEXES);

SELECT create_hypertable('sqlth_1_data_new', 't_stamp', chunk_time_interval => 86400000);

I then called

CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new');

Whenever I check the migration log, it tends to stop at around ~9 records not migrated and ~150,000 records migrated. That is far too few rows for a 170 GB table. If I stop the call and start it again, it finds additional records (over 1 million) and begins migrating them. I also attempted the multithreaded options:

CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '4 hours'::interval,'tagid',4,1);
CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '4 hours'::interval,'tagid',4,2);
CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '4 hours'::interval,'tagid',4,3);
CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '4 hours'::interval,'tagid',4,4);

This would migrate the same amount and then hang again, around 39 records left. If I stopped it and restarted it, it would not find additional records unless I changed the interval. I have tried different intervals (1, 4, 12, and 24 hours); they don't stop at the same unmigrated-record count. Unlike the single-threaded option, the multithreaded option never seems to find additional records.

If I run the single-threaded option, let it hang, stop it, run it a second time, let it run for a bit so it finds the unmigrated records, stop it again, and then start the multithreaded option, the multithreaded option begins migrating the unmigrated records that the single-threaded run found. However, at some point in the process (and it is different every time) the multithreaded migration stops working again, and interrupting and restarting it does not make it continue. If I run the single-threaded option it begins working again, but killing it and switching back to the multithreaded option does not make the migration continue.

I checked pg_stat_activity to see if I could spot anything, but nothing really stands out to me. I'm not sure what the problem could be. Any ideas?

All rows are in database eftestrestore (datid 2.82e+08); the key pg_stat_activity columns:

| pid | usename | application_name | backend_start | wait_event | state | backend_xid | backend_xmin | query |
|-----|---------|------------------|---------------|------------|-------|-------------|--------------|-------|
| 54991 | postgres | TimescaleDB Background Worker Scheduler | 2022-09-13 15:01:06.245292+00 | Extension: Extension | idle | | | |
| 361230 | postgres | psql | 2022-09-15 12:14:24.759636+00 | | active | 18700543 | 18700543 | CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '12 hours'::interval,'tagid',4,1); |
| 361496 | wbsadmin | psql | 2022-09-15 12:15:56.418579+00 | Lock: relation | active | 18700547 | 18700543 | CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '12 hours'::interval,'tagid',4,2); |
| 370071 | wbsadmin | psql | 2022-09-15 13:20:10.451209+00 | Lock: relation | active | 18700550 | 18700543 | CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '12 hours'::interval,'tagid',4,3); |
| 361688 | wbsadmin | psql | 2022-09-15 12:16:33.271994+00 | Lock: relation | active | 18700553 | 18700543 | CALL migrate_to_hypertable('sqlth_1_data','sqlth_1_data_new', '12 hours'::interval,'tagid',4,4); |
| 361921 | ignhist | pgAdmin 4 - DB:eftestrestore | 2022-09-15 12:18:41.512397+00 | Client: ClientRead | idle | | | (pgAdmin catalog query, reproduced below) |
| 361963 | ignhist | pgAdmin 4 - CONN:7595495 | 2022-09-15 12:19:04.856964+00 | | active | | 18700543 | select * from pg_stat_activity where datname = 'eftestrestore' |

The idle pgAdmin session's last query was its own catalog browser query:

SELECT rel.oid, rel.relname AS name,
  (SELECT count(*) FROM pg_catalog.pg_trigger WHERE tgrelid = rel.oid AND tgisinternal = FALSE) AS triggercount,
  (SELECT count(*) FROM pg_catalog.pg_trigger WHERE tgrelid = rel.oid AND tgisinternal = FALSE AND tgenabled = 'O') AS has_enable_triggers,
  (CASE WHEN rel.relkind = 'p' THEN true ELSE false END) AS is_partitioned,
  (SELECT count(1) FROM pg_catalog.pg_inherits WHERE inhrelid = rel.oid LIMIT 1) AS is_inherits,
  (SELECT count(1) FROM pg_catalog.pg_inherits WHERE inhparent = rel.oid LIMIT 1) AS is_inherited
FROM pg_catalog.pg_class rel
WHERE rel.relkind IN ('r','s','t','p') AND rel.relnamespace = 2200::oid
  AND NOT rel.relispartition
ORDER BY rel.relname;
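If it helps anyone hitting the same thing: pg_blocking_pids(), built into Postgres since 9.6 and nothing TimescaleDB-specific, shows which session each "Lock: relation" waiter above is actually stuck behind:

-- sessions waiting on a lock, and the pids blocking them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       state,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE datname = 'eftestrestore'
  AND cardinality(pg_blocking_pids(pid)) > 0;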
chris-morris-h2o commented 1 year ago

I changed my interval to 5 minutes and the process has been running without hanging for quite a while; it has now found 4,271,354 unmigrated records.
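For reference, that count can be cross-checked with plain row counts, independent of whatever the migration log reports (slow on a table this size, but unambiguous):

-- rows still to migrate = source count minus destination count
SELECT (SELECT count(*) FROM sqlth_1_data)
     - (SELECT count(*) FROM sqlth_1_data_new) AS rows_remaining;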

Is there a way to calculate the optimal interval for processing?
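One rough way to approximate it (a sketch, not anything from migrate.md; it assumes fairly uniform row density, an arbitrary 100,000-row batch target, and that t_stamp is milliseconds since epoch as in the create_hypertable call above):

-- average ms per row (from the planner's row estimate) times the target batch size;
-- run ANALYZE sqlth_1_data first so reltuples is populated
SELECT round(
         (SELECT max(t_stamp) - min(t_stamp) FROM sqlth_1_data)::numeric
       / (SELECT greatest(reltuples, 1) FROM pg_class WHERE relname = 'sqlth_1_data')
       * 100000) AS interval_ms_for_100k_rows;
-- e.g. a result of ~300000 would suggest a '5 minutes'::interval batch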

haydenflinner commented 1 year ago

@chris-morris-h2o did you find an answer to this? I'm wondering about the same thing, both for this migrate.sql and for hypertable chunk intervals in general; it would be nice if there were a way to estimate at runtime what a good batch or chunk size would be, since my input data can vary dramatically in density. I'm thinking of using migrate_to_hypertable in combination with staging tables for backfilling, rather than hammering the main hypertable with naive inserts from already-sorted staging tables.
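For example, a quick density histogram over the source (or staging) table would show how uneven the data is before picking a batch size; a minimal sketch, borrowing the table and ms-epoch t_stamp column from the example above (86400000 ms = 1 day):

-- rows per day, densest days first; size batches for the worst case
SELECT (t_stamp / 86400000) * 86400000 AS day_bucket_ms,
       count(*) AS rows_in_day
FROM sqlth_1_data
GROUP BY 1
ORDER BY rows_in_day DESC
LIMIT 10;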