Altinity / clickhouse-backup

Tool for easy backup and restore for ClickHouse® using object storage for backup files.
https://altinity.com

Cannot Restore single table from backup #445

Closed: arikodelabs closed this issue 2 years ago

arikodelabs commented 2 years ago

Hello, hope you are well. As you can see from the title, I am unable to restore a single table from the backup; I get the following error: "2022/05/19 15:12:49.778962 error no have found schemas by bms_qa_democlient.point_time_series in backup_2022-05-09"

I will provide screenshots showing that I am using the correct schema, database, and table for the command: "clickhouse-backup restore -t 'bms_qa_democlient.point_time_series' backup_2022-05-09"

[screenshot]

However, a complete backup restore works just fine!

ClickHouse version: 22.4.5.9

clickhouse-backup version: 1.4.0

Thanks in advance

Best Regards

Ari Berisha

Slach commented 2 years ago

Could you share the clickhouse-backup print-config result? Also, try removing the single quotes after -t.
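
For reference, the unquoted form of the command from the report would be (same table and backup name, just without quotes around the -t value):

clickhouse-backup restore -t bms_qa_democlient.point_time_series backup_2022-05-09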

arikodelabs commented 2 years ago

Hello, thanks for reaching out!

Yes, I've tried without quotes, with single quotes, and with double quotes, but to no avail. Here is the config:

general:
  remote_storage: none
  max_file_size: 0
  disable_progress_bar: true
  backups_to_keep_local: 0
  backups_to_keep_remote: 0
  log_level: info
  allow_empty_backups: false
  download_concurrency: 1
  upload_concurrency: 1
  restore_schema_on_cluster: ""
  upload_by_part: true
  download_by_part: true
clickhouse:
  username: default
  password: ""
  host: localhost
  port: 9000
  disk_mapping: {}
  skip_tables:
  - system.*
  - INFORMATION_SCHEMA.*
  - information_schema.*
  timeout: 5m
  freeze_by_part: false
  secure: false
  skip_verify: false
  sync_replicated_tables: false
  log_sql_queries: true
  config_dir: /etc/clickhouse-server/
  restart_command: systemctl restart clickhouse-server
  ignore_not_exists_error_during_freeze: true
  tls_key: ""
  tls_cert: ""
  tls_ca: ""
  debug: false
s3:
  access_key: ""
  secret_key: ""
  bucket: ""
  endpoint: ""
  region: us-east-1
  acl: private
  assume_role_arn: ""
  force_path_style: false
  path: ""
  disable_ssl: false
  compression_level: 1
  compression_format: tar
  sse: ""
  disable_cert_verification: false
  storage_class: STANDARD
  concurrency: 1
  part_size: 0
  max_parts_count: 10000
  allow_multipart_download: false
  debug: false
gcs:
  credentials_file: ""
  credentials_json: ""
  bucket: ""
  path: ""
  compression_level: 1
  compression_format: tar
  debug: false
  endpoint: ""
cos:
  url: ""
  timeout: 2m
  secret_id: ""
  secret_key: ""
  path: ""
  compression_format: tar
  compression_level: 1
  debug: false
api:
  listen: localhost:7171
  enable_metrics: true
  enable_pprof: false
  username: ""
  password: ""
  secure: false
  certificate_file: ""
  private_key_file: ""
  create_integration_tables: false
  allow_parallel: false
ftp:
  address: ""
  timeout: 2m
  username: ""
  password: ""
  tls: false
  path: ""
  compression_format: tar
  compression_level: 1
  concurrency: 1
  debug: false
sftp:
  address: ""
  port: 22
  username: ""
  password: ""
  key: ""
  path: ""
  compression_format: tar
  compression_level: 1
  concurrency: 1
  debug: false
azblob:
  endpoint_suffix: core.windows.net
  account_name: ""
  account_key: ""
  sas: ""
  use_managed_identity: false
  container: ""
  path: ""
  compression_level: 1
  compression_format: tar
  sse_key: ""
  buffer_size: 0
  buffer_count: 3
  max_parts_count: 0
Slach commented 2 years ago

looks strange

the error can happen in two places:

https://github.com/AlexAkulov/clickhouse-backup/blob/master/pkg/backup/restore.go#L240 in restoreSchema and https://github.com/AlexAkulov/clickhouse-backup/blob/master/pkg/backup/restore.go#L363 in restoreData

according to your logs and the command parameters you shared, it looks like the error happens in restoreData, because the logs show restoreSchema already ran and executed DROP + CREATE

could you share ls -la /var/lib/clickhouse/backup/backup_2022-05-09/ and the contents of /var/lib/clickhouse/backup/backup_2022-05-09/metadata.json?

and share the result of clickhouse-backup list local?

arikodelabs commented 2 years ago

For the first command, here it is:

total 16
drwxr-xr-x  4 root       root       4096 May 16 14:19 .
drwxr-xr-x  3 clickhouse clickhouse 4096 May 16 14:19 ..
drwxr-xr-x 77 root       root       4096 May 16 14:19 metadata
drwxr-xr-x 28 root       root       4096 May 16 14:40 shadow

For the second command: there is no metadata.json in the backup directory, but there is a folder called metadata with the following contents:

bms_dev_client_admin    bms_qa_3cricle            bms_qa_clienttesting     bms_qa_new_client
bms_dev_client_test     bms_qa_45                 bms_qa_clienttestt       bms_qa_neww
bms_dev_democlient      bms_qa_ali                bms_qa_clientthree       bms_qa_oclient
bms_dev_detroit         bms_qa_apiautomation      bms_qa_db                bms_qa_qa_client
bms_dev_external_idp    bms_qa_apitesting         bms_qa_democlient        bms_qa_qaclient
bms_dev_external_idp_2  bms_qa_asdasd             bms_qa_democlient_trimi  bms_qa_qadetroit
bms_dev_external_only   bms_qa_authtestclient     bms_qa_detroitqa         bms_qa_qr_client
bms_dev_mobile_client   bms_qa_auto               bms_qa_diellza1          bms_qa_quadc
bms_dev_new_client      bms_qa_beli               bms_qa_doni              bms_qa_quadclient
bms_dev_pre_prod        bms_qa_blerim             bms_qa_festim            bms_qa_rr
bms_dev_qa_client       bms_qa_chanclient         bms_qa_heyhey            bms_qa_rritar
bms_dev_tedtkossfsfs    bms_qa_client             bms_qa_honeywell         bms_qa_teest
bms_dev_test            bms_qa_client_automation  bms_qa_infernoclient     bms_qa_test
bms_dev_test34          bms_qa_client_test        bms_qa_kodeclient        bms_qa_test124311
bms_dev_test_client     bms_qa_client_with_idp    bms_qa_leart             bms_qa_test_client
bms_flink_democlient    bms_qa_clientdemo         bms_qa_mackinac          bms_qa_testclient
bms_flink_uiautomation  bms_qa_clientfour         bms_qa_muli              bms_qa_testclientt
bms_local_bmskodeos     bms_qa_clientone          bms_qa_mymsiclient       bms_qa_testing_client
bms_qa_1circle          bms_qa_clienttest         bms_qa_name

and for the third command:

2022/05/20 08:23:47.105098  info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2022/05/20 08:23:47.112875  info SELECT * FROM system.disks;
backup_2022-05-09   ???   16/05/2022 14:19:44   local   
Slach commented 2 years ago

drwxr-xr-x  4 root       root       4096 May 16 14:19 .
drwxr-xr-x  3 clickhouse clickhouse 4096 May 16 14:19 ..
drwxr-xr-x 77 root       root       4096 May 16 14:19 metadata
drwxr-xr-x 28 root       root       4096 May 16 14:40 shadow

backup_2022-05-09/metadata.json is missing, so your backup is broken. During restore, getLocalBackups will mark your backup as "Legacy" and, instead of "table_name.json", will try to read "table_name.sql", which is also missing, so restoreData will fail.

How did you create your backup, and how did you move it onto the destination server? You have remote_storage: none in your config, which means you didn't use the upload and download commands.
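
For comparison, a healthy (non-legacy) local backup produced by a recent clickhouse-backup create is expected to have a top-level metadata.json alongside the metadata/ and shadow/ folders, roughly like this (illustrative layout, not output from this system):

/var/lib/clickhouse/backup/backup_2022-05-09/
  metadata.json        # backup-level manifest (missing in this case)
  metadata/            # per-table <database>/<table>.json files
  shadow/              # frozen data parts, per database/table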

arikodelabs commented 2 years ago

Here is the metadata.json

https://drive.google.com/file/d/1NZZMNFAegbo94-bE-SipBH_dZQAQATWE/view?usp=sharing

2022/05/20 09:40:49.080614 info SELECT database, name, engine , data_paths , uuid , create_table_query , coalesce(total_bytes, 0) AS total_bytes FROM system.tables WHERE is_temporary = 0 AND match(concat(database,'.',name),'bms_qa_democlient.point_time_series') SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1
2022/05/20 09:40:49.268126 info SELECT sum(bytes_on_disk) as size FROM system.parts WHERE active AND database='bms_qa_democlient' AND table='point_time_series' GROUP BY database, table
2022/05/20 09:40:49.277220 error can't restore 'bms_qa_democlient.point_time_series': error during filepath.Walk for part '202101_1476010_1480989_4': lstat /var/lib/clickhouse/backup/backup_2022-05-16/shadow/bms_qa_democlient/point_time_series/202101_1476010_1480989_4: no such file or directory

The backup data probably got corrupted while I was trying to work around the first error I encountered.

Slach commented 2 years ago

Could you explain how exactly you create your backup?

The last error means point_time_series.json lists parts that are not present on disk in the backup folder
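
One way to see the mismatch is to compare the part names referenced in the table's metadata JSON against the part directories actually present under shadow/. A rough sketch, assuming the paths from the error above and that part names look like 202101_1476010_1480989_4 (the grep pattern is only illustrative):

# parts referenced by the per-table metadata
grep -oE '[0-9]+_[0-9]+_[0-9]+_[0-9]+' \
  /var/lib/clickhouse/backup/backup_2022-05-16/metadata/bms_qa_democlient/point_time_series.json | sort -u

# parts actually present in the backup
ls /var/lib/clickhouse/backup/backup_2022-05-16/shadow/bms_qa_democlient/point_time_series/ | sort -u

Any name that appears in the first list but not in the second is a part restoreData will fail on.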

arikodelabs commented 2 years ago

We back up using a YAML config file and this command: clickhouse-backup upload -c clickhouse1.yml ClickHouseDev1. The YAML file is:

general:
  remote_storage: gcs
  max_file_size: 1099511627776
  disable_progress_bar: false
  backups_to_keep_local: 0
  backups_to_keep_remote: 0
  log_level: info
  allow_empty_backups: false
clickhouse:
  username: xxxxx
  password: "yyyyyy"
  host: 172.16.33.11
  port: 9000
  disk_mapping: {}
  skip_tables:
  - system.*
  timeout: 30m
  freeze_by_part: false
  secure: false
  skip_verify: false
  sync_replicated_tables: true
gcs:
  credentials_file: "/etc/google/gcs.json"
  credentials_json: ""
  bucket: "clickhouse-1-backup"
  path: "clickhouse-dev-1-backup"
  compression_level: 1
  compression_format: tar
arikodelabs commented 2 years ago

Making a full restore always works just fine, but not when trying to restore only one table. Also, is it possible to back up only one schema and one table?

When restoring the whole backup, point_time_series.json for example gets filled with data, but not when I specify the table. Thanks!

Slach commented 2 years ago

@arikodelabs

currently your backup name is not ClickHouseDev1, and your clickhouse-backup/config.yaml has remote_storage: none

https://drive.google.com/file/d/1NZZMNFAegbo94-bE-SipBH_dZQAQATWE/view?usp=sharing

this is the metadata.json from another backup ("backup_name": "backup_2022-05-16"), not from backup_2022-05-09

"When restoring the whole backup, point_time_series.json for example gets filled with data, but not when I specify the table."

That is because restoreData does not try to read metadata.json during a full restore.

So, how did you download backup_2022-05-09 to your destination host? At what point did you lose your metadata.json?

arikodelabs commented 2 years ago

We have a few databases that are 20+ GB. I will make another backup. However, is it possible to back up just one database with one table and its data, and restore the same on our secondary instance? If so, what is the exact command?

P.S. you are a legend

Slach commented 2 years ago

clickhouse-backup create_remote --tables="db.table" one_table_backup will create + upload a backup for only one table

clickhouse-backup restore_remote --rm one_table_backup will download + restore it

Slach commented 2 years ago

also, don't forget to run clickhouse-backup delete local one_table_backup after a successful create_remote and restore_remote
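
Putting those commands together for the table from this thread, a minimal end-to-end sequence might look like this (one_table_backup is just an example name; remote_storage must point at your GCS settings on both hosts):

# on the source host
clickhouse-backup create_remote --tables="bms_qa_democlient.point_time_series" one_table_backup
clickhouse-backup delete local one_table_backup

# on the destination host
clickhouse-backup restore_remote --rm one_table_backup
clickhouse-backup delete local one_table_backup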