Open Guzya opened 3 years ago
Good afternoon! It looks like the problem is that the history file is missing on the server side, which raises the question of whether that is legitimate or not. How was this PG cluster created? Was it restored from a backup?
Honestly, I can't answer that. It is a test server running patroni. We only noticed today that pg_probackup had stopped working, so I decided to get it working again. I installed pg_probackup on the archive server, reusing the directory left over from the previous version. I deleted all archives and backups using pg_probackup itself, and I am now trying to take backups from scratch.
The history file is no longer there.
OK, and what was done with the server? Was it set up once and then only switchovers/failovers were performed? Was the cluster ever restored from backups?
Is this history file present in the archive?
In any case, this is something you should look into: a PostgreSQL cluster that is not on timeline 1 must have a history file. In other words, something went wrong on your side, and the failed backup is only a symptom.
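To make the timeline rule above concrete, here is a minimal Python sketch that derives the timeline from a WAL segment name and reports which history file should be present in pg_wal. The function names are hypothetical and not part of PostgreSQL or pg_probackup; the sketch only relies on the naming convention that the first 8 hex digits of a segment name are the timeline ID and that the history file is named after the same 8 digits.

```python
from pathlib import Path
from typing import Optional


def timeline_of(segment_name: str) -> int:
    """Return the timeline ID encoded in the first 8 hex digits of a WAL segment name."""
    return int(segment_name[:8], 16)


def expected_history_file(segment_name: str) -> Optional[str]:
    """Name of the timeline history file that should exist, or None on timeline 1."""
    tli = timeline_of(segment_name)
    if tli == 1:
        return None  # the initial timeline has no history file
    return f"{tli:08X}.history"


def history_file_present(pg_wal: Path, segment_name: str) -> bool:
    """True if no history file is needed or the expected one exists in pg_wal."""
    name = expected_history_file(segment_name)
    return name is None or (pg_wal / name).is_file()
```

For the segment 00000011000005B1000000D5 from this issue, the timeline is 0x11 (17 decimal), so the sketch expects 00000011.history in pg_wal.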
As far as I know, nothing was done to PostgreSQL itself; it is running. As I understand it, the archive server pg-archive was reinstalled back in January. The PostgreSQL servers were not touched. I have now noticed that backups were not being taken because pg_probackup was missing on pg-archive, and decided to get it working again. But backups do not go through in --stream mode.
I found 00000011.history on the replica. On the primary server there is /var/lib/postgresql/11/main/pg_wal/archive_status/00000011.history.done
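The situation described here (an archive_status marker exists, but the history file itself is gone from pg_wal) can be checked for with a short Python sketch. The function name is hypothetical, and the sketch assumes the usual layout where markers live in pg_wal/archive_status and carry a .ready or .done suffix.

```python
from pathlib import Path
from typing import List


def orphaned_history_markers(pg_wal: Path) -> List[str]:
    """List archive_status markers for .history files that are absent from pg_wal."""
    status_dir = pg_wal / "archive_status"
    orphans = []
    for marker in sorted(status_dir.glob("*.history.*")):
        if marker.suffix not in (".ready", ".done"):
            continue
        # Strip the .ready / .done suffix to get the history file name.
        wal_name = marker.name[: -len(marker.suffix)]
        if not (pg_wal / wal_name).exists():
            orphans.append(marker.name)
    return orphans
```

Run against the primary's pg_wal directory, a result such as 00000011.history.done would match what is reported above.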
Are you taking the backup from the master or from the replica?
I take the backup from the master. I put the history file in place and the backup went through.
I checked the information in patroni: the last master switch was on January 14 (the creation date of 00000011.history on the current replica).
promote would definitely have created the history file; this is very strange.
From the master:
[pg-node1-test:] pg_lsclusters
Ver Cluster Port Status Owner Data directory Log file
11 main 5432 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
[pg-node1-test:] psql
psql (11.6 (Debian 11.6-1.pgdg90+1))
Type "help" for help.

postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+--------------+---------------------
pg_node2_test | | physical | | | f | t | 1932 | | | 5B1/DF14E9C8 |
(1 row)

postgres=# select * from pg_replication_slots \gx
-[ RECORD 1 ]-------+--------------
slot_name | pg_node2_test
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 1932
xmin |
catalog_xmin |
restart_lsn | 5B1/DF14E9C8
confirmed_flush_lsn |

postgres=# select * from pg_stat_replication \gx
-[ RECORD 1 ]----+------------------------------
pid | 1932
usesysid | 10
usename | postgres
application_name | pg-node2-test
client_addr | 172.5.8.9
client_hostname | pg-node2-test.local
client_port | 48278
backend_start | 2021-01-14 09:45:41.949374-04
backend_xmin |
state | streaming
sent_lsn | 5B1/DF14E9C8
write_lsn | 5B1/DF14E9C8
flush_lsn | 5B1/DF14E9C8
replay_lsn | 5B1/DF14E9C8
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
(the creation date of 00000011.history on the current replica)
The history file of a new timeline does not appear on the replica by itself; it arrives from the master in exactly the same way the backup tries to fetch it. So it looks like it went missing on the master.
I'll keep investigating. Thanks!
Wait, don't close the issue yet, I haven't even tried to reproduce it =) Maybe there is some sequence of legitimate actions that makes the history file disappear.
One more thing: for some time before I started taking the backup, I had set archive_command=true
That is, while I was setting up pg_probackup on the archive server, I set archive_command=true so that the excess WAL files would be released and disk space freed.
In the PostgreSQL log, before setting it to true and after switching it back:
2021-02-24 04:03:53.935 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-02-24 04:04:54.199 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-02-24 04:04:55.396 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-02-24 04:04:56.580 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-02-24 04:04:56.580 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-02-24 04:05:56 AST [28975]: [1-1]: INFO: pg_probackup archive-push WAL file: 00000011.history, threads: 1/1, batch: 1/1, compression: none
2021-02-24 04:05:56 AST [28975]: [1-1]: LOG: pushing file "00000011.history"
2021-02-24 04:05:56 AST [28975]: [1-1]: ERROR: Cannot open source file "/var/lib/postgresql/11/main/pg_wal/00000011.history": No such file or directory
2021-02-24 04:05:56.868 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
Correction, I was wrong: 2021-02-24 04:38:03.585 AST [1863] LOG: parameter "archive_command" changed to "true"
So these entries are apparently from before I installed pg_probackup on the archive server, and from after it.
I found the earliest mentions in the logs:
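To see how long a given file has been failing to archive, the PostgreSQL log can be summarized with a small Python sketch like the one below. The function name is hypothetical; the regular expression is based only on the WARNING text visible in the log excerpts in this issue.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the archiver WARNING line as it appears in the log excerpts of this issue.
FAILED_RE = re.compile(
    r'WARNING:\s+archiving write-ahead log file "([^"]+)" failed too many times'
)


def count_archive_failures(lines: Iterable[str]) -> Counter:
    """Count 'failed too many times' warnings per WAL or history file name."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of postgresql-11-main.log would show whether 00000011.history is the only file the archiver keeps retrying.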
2021-01-14 09:45:41 AST [1895]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:45:41 AST [1906]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:45:41 AST [1914]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:45:46 AST [2058]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:45:51 AST [2109]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:45:56 AST [2145]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:01 AST [2197]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:06 AST [2273]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:11 AST [2315]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:16 AST [2428]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:21 AST [2492]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:26 AST [2543]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:31 AST [2645]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:36 AST [2706]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:41 AST [2812]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:46 AST [2933]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:51 AST [2984]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:46:56 AST [3037]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:01 AST [3071]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:06 AST [3147]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:11 AST [3254]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:16 AST [3366]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:21 AST [3429]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:26 AST [3480]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:31 AST [3551]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:36 AST [3622]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:41 AST [3708]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:46 AST [3834]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:51 AST [3886]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:47:56 AST [3931]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:01 AST [4234]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:06 AST [4322]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:11 AST [4435]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:16 AST [4618]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:21 AST [4759]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:26 AST [4881]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:31 AST [4970]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:36 AST [5030]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:41 AST [5123]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:46 AST [5258]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:51 AST [5309]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:48:56 AST [5363]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:01 AST [5415]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:06 AST [5472]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:11 AST [5530]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:16 AST [5646]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:21 AST [5706]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:26 AST [5757]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:31 AST [5826]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:36 AST [5876]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:41 AST [5988]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:46 AST [6107]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:51 AST [6160]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:49:56 AST [6214]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:01 AST [6268]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:06 AST [6374]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:11 AST [7495]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:16 AST [7573]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:21 AST [7637]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:26 AST [7697]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:34 AST [7774]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:36 AST [7797]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:41 AST [7903]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:46 AST [8022]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:51 AST [8074]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:50:57 AST [8137]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:01 AST [8182]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:06 AST [8259]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:11 AST [8327]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:16 AST [8422]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:21 AST [8486]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:26 AST [8530]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:31 AST [8600]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:32 AST [8621]: [1-1]: INFO: pg_probackup archive-get WAL file: 00000011.history, remote: ssh, threads: 1/1, batch: 1
2021-01-14 09:51:32.375 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:51:33.404 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:51:34.433 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:51:34.433 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-01-14 09:52:34.509 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:52:35.550 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:52:36.577 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:52:36.577 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-01-14 09:53:36.662 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:53:37.688 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:53:38.714 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:53:38.714 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-01-14 09:54:38.769 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:54:39.801 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:54:40.827 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:54:40.827 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-01-14 09:55:40.894 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:55:41.921 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:55:42.950 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:55:42.950 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-01-14 09:56:43.041 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:56:44.064 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:56:45.092 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:56:45.092 AST [8628] WARNING: archiving write-ahead log file "00000011.history" failed too many times, will try again later
2021-01-14 09:57:45.173 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
2021-01-14 09:57:46.199 AST [8628] DETAIL: The failed archive command was: /usr/bin/pg_probackup-11 archive-push -B /archive/pg_probackup/ --instance pg-node-test --wal-file-path=pg_wal/00000011.history --wal-file-name=00000011.history --remote-host=172.
Greetings! I have two servers, pg-node1-test and pg-archive-test. Version 2.4.2 was installed previously. After the OS was reinstalled on pg-archive-test, I installed version 2.4.10 on both servers. When a backup is run with the --stream flag, the process hangs at the point where the remaining WAL is being sent:
INFO: Backup start, pg_probackup version: 2.4.10, instance: pg-node-test, backup ID: QP10QN, backup mode: DELTA, wal mode: STREAM, remote: true, compress-algorithm: none, compress-level: 1
WARNING: This PostgreSQL instance was initialized without data block checksums. pg_probackup have no way to detect data block corruption without them. Reinitialize PGDATA with option '--data-checksums'.
WARNING: Backup QP103F has status: RUNNING. Cannot be a parent.
INFO: Parent backup: QP0Z9E
(null): could not send replication command "TIMELINE_HISTORY": ERROR: could not open file "pg_wal/00000011.history": No such file or directory
ERROR: Problem in receivexlog
INFO: PGDATA size: 362MB
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 4s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
INFO: Wait for LSN 5B1/D5000160 in streamed WAL segment /archive/pg_probackup/backups/pg-node-test/QP10QN/database/pg_wal/00000011000005B1000000D5
ERROR: WAL segment 00000011000005B1000000D5 could not be streamed in 300 seconds
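For reference, the segment name in the final error of the backup output follows from the timeline and the LSN being waited for. Below is a minimal Python sketch of that arithmetic; the function name is hypothetical, and the 16 MB segment size is an assumption (it is the PostgreSQL default, but the issue does not state the actual wal_segment_size of this cluster).

```python
def wal_segment_name(timeline: int, lsn: str, segment_size: int = 16 * 1024 * 1024) -> str:
    """Build the 24-hex-digit WAL segment file name that contains the given LSN.

    segment_size defaults to 16 MB, which is an assumption about this cluster.
    """
    hi_str, lo_str = lsn.split("/")
    # An LSN such as 5B1/D5000160 is a 64-bit position written as two hex halves.
    position = (int(hi_str, 16) << 32) | int(lo_str, 16)
    segno = position // segment_size
    segments_per_xlogid = 0x100000000 // segment_size
    return (
        f"{timeline:08X}"
        f"{segno // segments_per_xlogid:08X}"
        f"{segno % segments_per_xlogid:08X}"
    )


print(wal_segment_name(0x11, "5B1/D5000160"))  # prints 00000011000005B1000000D5
```

With timeline 0x11 and LSN 5B1/D5000160 this gives 00000011000005B1000000D5, the segment named in the error, which is consistent with the default segment size.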
On the pg-node1-test side:
2021-02-24 05:23:16 AST [11877]: [1-1]: INFO: pg_probackup archive-push WAL file: 00000011000005B1000000D5, threads: 1/1, batch: 1/1, compression: none
2021-02-24 05:23:16 AST [11877]: [1-1]: LOG: pushing file "00000011000005B1000000D5"
2021-02-24 05:23:16 AST [11877]: [1-1]: VERBOSE: Temp WAL file successfully created: "/archive/pg_probackup/wal/pg-node-test/00000011000005B1000000D5.part"
2021-02-24 05:23:16 AST [11877]: [1-1]: VERBOSE: Rename "/archive/pg_probackup/wal/pg-node-test/00000011000005B1000000D5.part" to "/archive/pg_probackup/wal/pg-node-test/00000011000005B1000000D5"
2021-02-24 05:23:16 AST [11877]: [1-1]: LOG: SSH process 11878 is terminated with status 0
2021-02-24 05:23:16 AST [11877]: [1-1]: INFO: pg_probackup archive-push completed successfully, pushed: 1, skipped: 0, time elapsed: 381ms
2021-02-24 05:23:16 AST [11891]: [1-1]: INFO: pg_probackup archive-push WAL file: 00000011000005B1000000D5.00000028.backup, threads: 1/1, batch: 1/1, compression: none
2021-02-24 05:23:16 AST [11891]: [1-1]: LOG: pushing file "00000011000005B1000000D5.00000028.backup"
2021-02-24 05:23:16 AST [11891]: [1-1]: VERBOSE: Temp WAL file successfully created: "/archive/pg_probackup/wal/pg-node-test/00000011000005B1000000D5.00000028.backup.part"
2021-02-24 05:23:16 AST [11891]: [1-1]: VERBOSE: Rename "/archive/pg_probackup/wal/pg-node-test/00000011000005B1000000D5.00000028.backup.part" to "/archive/pg_probackup/wal/pg-node-test/00000011000005B1000000D5.00000028.backup"
2021-02-24 05:23:16 AST [11891]: [1-1]: LOG: SSH process 11892 is terminated with status 0
2021-02-24 05:23:16 AST [11891]: [1-1]: INFO: pg_probackup archive-push completed successfully, pushed: 1, skipped: 0, time elapsed: 6ms
2021-02-24 05:24:00.000 AST [8629] LOG: cron job 1 starting: select cron.ins_data_session_history()
2021-02-24 05:24:00.109 AST [8629] LOG: cron job 1 completed: 1 row
2021-02-24 05:28:18.134 AST [11709] backup@backupdb LOG: could not receive data from client: Connection reset by peer
2021-02-24 05:28:18.134 AST [11731] backup@[unknown] LOG: could not receive data from client: Connection reset by peer
Without --stream the backup completes successfully.