Altinity / clickhouse-backup

Tool for easy backup and restore for ClickHouse® using object storage for backup files.
https://altinity.com

How to store backup data in a custom directory #272

Closed sixinyiyu closed 2 years ago

sixinyiyu commented 3 years ago

version info

Version:     1.0.0
Git Commit:  37f3bd78adec2aadc1de10f9323fe426a5e12dc4
Build Date:  2021-06-16

system.disks

┌─name────┬─path─────────┬───free_space─┬──total_space─┬─keep_free_space─┬─type──┐
│ default │ /data01/ckk/ │ 147627498496 │ 422621649920 │            1024 │ local │
│ disk1   │ /data02/ckk/ │ 104120437760 │ 422621649920 │            1024 │ local │
└─────────┴──────────────┴──────────────┴──────────────┴─────────────────┴───────┘

config.yml

disk_mapping: {"disk1":"/home/aaa/", "default":"/home/bbb/"}
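For reference, in 1.x configs disk_mapping is nested under the clickhouse section of config.yml rather than at the top level; a sketch using this issue's values (surrounding keys and exact semantics may differ by version):

```yaml
# config.yml sketch: disk_mapping pairs each ClickHouse disk name
# with the path clickhouse-backup should associate with that disk
clickhouse:
  disk_mapping:
    default: /home/bbb/   # disk "default", mounted at /data01/ckk/ in system.disks
    disk1: /home/aaa/     # disk "disk1", mounted at /data02/ckk/ in system.disks
```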

The result:

/data02/ckk/shadow and /data01/ckk/shadow contain the latest shadow files.

/home/bbb/ contains metadata and metadata.json.

/home/aaa/ contains nothing.

the log

2021/09/22 16:54:02 debug SELECT name, engine FROM system.databases WHERE name != 'system'

2021/09/22 16:54:02 debug SELECT count() FROM system.settings WHERE name = 'show_table_uuid_in_table_create_query_if_not_nil'
2021/09/22 16:54:02 debug SELECT * FROM system.tables WHERE is_temporary = 0 SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1
2021/09/22 16:54:12 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:12 debug SELECT * FROM system.disks;
2021/09/22 16:54:12 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:12 debug SELECT * FROM system.disks;
2021/09/22 16:54:12 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:12 debug SELECT * FROM system.disks;
2021/09/22 16:54:12 debug create data               backup=FFF operation=create table=test.dwd_knowledge_nox_vuln_rating_test_local
2021/09/22 16:54:12 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:12 debug SELECT * FROM system.disks;
2021/09/22 16:54:12 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:12 debug SYSTEM SYNC REPLICA `test`.`dwd_knowledge_nox_vuln_rating_test_local`;
2021/09/22 16:54:12 debug replica synced            table=test.dwd_knowledge_nox_vuln_rating_test_local
2021/09/22 16:54:12 debug ALTER TABLE `test`.`dwd_knowledge_nox_vuln_rating_test_local` FREEZE WITH NAME '50be6aee6e154ae19a60d086520dbd9b';
2021/09/22 16:54:17 debug freezed                   backup=FFF operation=create table=test.dwd_knowledge_nox_vuln_rating_test_local
2021/09/22 16:54:17 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:17 debug SELECT * FROM system.disks;
2021/09/22 16:54:17 debug done                      backup=FFF operation=create table=test.dwd_knowledge_nox_vuln_rating_test_local
2021/09/22 16:54:17 debug create metadata           backup=FFF operation=create table=test.dwd_knowledge_nox_vuln_rating_test_local
2021/09/22 16:54:17  info done                      backup=FFF operation=create table=test.dwd_knowledge_nox_vuln_rating_test_local
2021/09/22 16:54:17 debug create data               backup=FFF operation=create table=xuanji_dwd.dwd_dayu_log_skylar_virus_local
2021/09/22 16:54:17 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:17 debug SELECT * FROM system.disks;
2021/09/22 16:54:17 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:17 debug SYSTEM SYNC REPLICA `xuanji_dwd`.`dwd_dayu_log_skylar_virus_local`;
2021/09/22 16:54:17 debug replica synced            table=xuanji_dwd.dwd_dayu_log_skylar_virus_local
2021/09/22 16:54:17 debug ALTER TABLE `xuanji_dwd`.`dwd_dayu_log_skylar_virus_local` FREEZE WITH NAME '54f64c74845f4cbd86f6e96f32d92307';
2021/09/22 16:54:17 debug freezed                   backup=FFF operation=create table=xuanji_dwd.dwd_dayu_log_skylar_virus_local
2021/09/22 16:54:17 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:17 debug SELECT * FROM system.disks;
2021/09/22 16:54:17 debug done                      backup=FFF operation=create table=xuanji_dwd.dwd_dayu_log_skylar_virus_local
2021/09/22 16:54:17 debug create metadata           backup=FFF operation=create table=xuanji_dwd.dwd_dayu_log_skylar_virus_local
2021/09/22 16:54:17  info done                      backup=FFF operation=create table=xuanji_dwd.dwd_dayu_log_skylar_virus_local
2021/09/22 16:54:17 debug create data               backup=FFF operation=create table=xuanji_dwd.dwd_knowledge_qvd_vuln
2021/09/22 16:54:17 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/09/22 16:54:17 debug SELECT * FROM system.disks;
2021/09/22 16:54:17 debug skipped                   backup=FFF engine=Distributed operation=create table=xuanji_dwd.dwd_knowledge_qvd_vuln
2021/09/22 16:54:17 debug create metadata           backup=FFF operation=create table=xuanji_dwd.dwd_knowledge_qvd_vuln
2021/09/22 16:54:17  info done                      backup=FFF operation=create table=xuanji_dwd.dwd_knowledge_qvd_vuln
2021/09/22 16:54:17 debug SELECT value FROM `system`.`build_options` where name='VERSION_DESCRIBE'
2021/09/22 16:54:17  info done                      backup=FFF operation=create

There is no log entry showing the shadow files being moved.

bunkiedc commented 2 years ago

Hi, has anything been done about this? I'm seeing the same behavior: I get nothing in the log (and no errors), and the shadow directory never shows up in the backup directory. I'm running a clickhouse-server cluster in Docker containers, with clickhouse-backup as a Docker container on the server that hosts the clickhouse-server container. I can see that the shadow directories are being created on my host server in /var/lib/docker/volumes/clickhouse/_data/shadow/, but they never appear in my backup directory, so the restore recreates the table, but always without data. It appears that clickhouse-backup is not able to locate the shadow directory when it's mapped as a Docker volume? Should the disk_mapping variable be mapped to the Docker volume rather than /var/lib/clickhouse?

thanks, david

Slach commented 2 years ago

There is no sense in changing the local backup directory: it must be on the same disk and file system, because a local backup under /var/lib/clickhouse/backup/shadow/db/table/part_name/* is just hard links to /var/lib/clickhouse/data/db/table/part_name/*.
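The hard-link point is easy to verify with plain shell tools; this sketch uses throwaway temp files (not real ClickHouse parts) to show that a hard link shares the source file's inode, which is also why it cannot cross filesystem boundaries:

```shell
# Hard links share an inode, so they cost no extra data space --
# the same mechanism clickhouse-backup uses for local backups.
set -e
d=$(mktemp -d)
echo "part data" > "$d/original"
ln "$d/original" "$d/backup"             # hard link, not a copy
stat -c '%i' "$d/original" "$d/backup"   # prints the same inode number twice
rm -r "$d"
```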

I'm running clickhouse-server cluster in docker containers

Use the same volumes for the clickhouse-server and clickhouse-backup containers. Example docker-compose:

services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    volumes:
      - /var/lib/clickhouse:/var/lib/clickhouse
  clickhouse_backup:
    image: altinity/clickhouse-backup:latest
    volumes:
      - /var/lib/clickhouse:/var/lib/clickhouse
bunkiedc commented 2 years ago

Ah, thanks, I figured it was something simple like that... It works now.

A few questions if you don't mind:

1. I'd like to have these backups placed somewhere else, not on the same server. Should they just be copied? I saw the note about rsync... is that the recommended method?
2. Does the variable "backups_to_keep_local" set the total number of backups kept locally, with the program deleting/cleaning up on a rolling basis (so it keeps just the last X backups)?
3. If I enable, for example, an S3 bucket, does it only get a copy of the current "create" (i.e. a backup of the backup), unaffected by "backups_to_keep_local", while I still get a local copy? In other words, can I use remote storage for the backups I was referring to in question 1, while keeping a small number locally?

thanks again for your quick response and the work you guys are doing!! best, david

Slach commented 2 years ago

I'd like to have these backups placed somewhere else, not on the same server. Should they just be copied? I saw the note about rsync... is that the recommended method?

Yes, rsync is good enough. The planned 2.x release implements remote_storage: custom, which could help automate this approach.
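A minimal sketch of the rsync approach (the hostname and destination path are placeholders, not from this thread):

```
# copy finished local backups to another machine; the hard links are
# materialized into ordinary files on the receiving side
rsync -a --delete /var/lib/clickhouse/backup/ backup-host:/srv/clickhouse-backups/
```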

Does the variable "backups_to_keep_local" set the total number of backups kept locally, with the program deleting/cleaning up on a rolling basis (so it keeps just the last X backups)?

Yes. When this variable is non-zero, clickhouse-backup deletes old local backups while executing the create command.

If enabling, for example, an S3 bucket,

What exactly do you mean? Do you mean remote_storage: s3, or something else?

Slach commented 2 years ago

If I enable, for example, an S3 bucket, does it only get a copy of the current "create" (i.e. a backup of the backup), unaffected by "backups_to_keep_local", while I still get a local copy? In other words, can I use remote storage for the backup I was referring to in question 1, while keeping a small number locally?

Yes, you can:

backups_to_keep_local: 1
backups_to_keep_remote: 7

This is the usual workflow.
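For reference, both retention settings live in the general section of config.yml, next to the remote storage selection; a sketch (the bucket name is a placeholder):

```yaml
general:
  remote_storage: s3
  backups_to_keep_local: 1    # keep only the newest local backup
  backups_to_keep_remote: 7   # keep the last seven backups in S3
s3:
  bucket: my-backup-bucket    # placeholder
```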

bunkiedc commented 2 years ago

Yes, remote storage. I was wondering if, for example, I could keep a rolling 30 days locally, and on the last day of each month do a local and remote copy with the same create command, using remote storage for long-term retention. My remote storage would then end up holding a copy of each month-end, and I could set "backups_to_keep_remote" to 12 to keep a rolling year of backups.

Slach commented 2 years ago

Avoid storing a lot of local backups; you will just allocate useless disk space.

Unfortunately, only a simple "keep the last X backups" retention policy is currently implemented. You are welcome to make a PR.