Closed LukaszMarchewka closed 4 months ago
let's figure out with your setup
could you share
clickhouse-backup print-config
without sensitive credentials?
looks like do you try to use bitnami docker container? is it bitnami clickhouse helm chart? or just container?
could you share
cat /bitnami/clickhouse/data/backup/incremental_2024-07-10T10-00-00/metadata/otel/otel_traces.json
configuration
general:
remote_storage: s3
max_file_size: 1073741824
backups_to_keep_local: -1
backups_to_keep_remote: 308
log_level: info
allow_empty_backups: false
download_concurrency: 1
upload_concurrency: 1
upload_max_bytes_per_second: 0
download_max_bytes_per_second: 0
object_disk_server_side_copy_concurrency: 32
use_resumable_state: true
restore_schema_on_cluster: default
upload_by_part: true
download_by_part: true
restore_database_mapping: {}
restore_table_mapping: {}
retries_on_failure: 3
retries_pause: 30s
watch_interval: 1h
full_interval: 24h
watch_backup_name_template: shard{shard}-{type}-{time:20060102150405}
sharded_operation_mode: none
cpu_nice_priority: 15
io_nice_priority: idle
rbac_backup_always: true
rbac_conflict_resolution: recreate
retriesduration: 30s
watchduration: 1h0m0s
fullduration: 24h0m0s
clickhouse:
username: admin
password: admin
host: localhost
port: 9000
disk_mapping: {}
skip_tables:
- system.*
- INFORMATION_SCHEMA.*
- information_schema.*
skip_table_engines: []
timeout: 5m
freeze_by_part: false
freeze_by_part_where: ""
use_embedded_backup_restore: false
embedded_backup_disk: ""
backup_mutations: true
restore_as_attach: false
check_parts_columns: true
secure: false
skip_verify: false
sync_replicated_tables: true
log_sql_queries: true
config_dir: /bitnami/clickhouse/etc
restart_command: exec:systemctl restart clickhouse-server
ignore_not_exists_error_during_freeze: true
check_replicas_before_attach: true
tls_key: ""
tls_cert: ""
tls_ca: ""
max_connections: 1
debug: false
s3:
access_key: xxx
secret_key: xxx
bucket: xxx
endpoint: xxx
region: us-east-1
acl: private
assume_role_arn: ""
force_path_style: false
path: ""
object_disk_path: ""
disable_ssl: true
compression_level: 1
compression_format: tar
sse: ""
sse_kms_key_id: ""
sse_customer_algorithm: ""
sse_customer_key: ""
sse_customer_key_md5: ""
sse_kms_encryption_context: ""
disable_cert_verification: false
use_custom_storage_class: false
storage_class: STANDARD
custom_storage_class_map: {}
concurrency: 1
part_size: 0
max_parts_count: 10000
allow_multipart_download: false
object_labels: {}
request_payer: ""
check_sum_algorithm: ""
debug: false
gcs:
credentials_file: ""
credentials_json: ""
credentials_json_encoded: ""
embedded_access_key: ""
embedded_secret_key: ""
skip_credentials: false
bucket: ""
path: ""
object_disk_path: ""
compression_level: 1
compression_format: tar
debug: false
force_http: false
endpoint: ""
storage_class: STANDARD
object_labels: {}
custom_storage_class_map: {}
client_pool_size: 6
chunk_size: 0
cos:
url: ""
timeout: 2m
secret_id: ""
secret_key: ""
path: ""
compression_format: tar
compression_level: 1
debug: false
api:
listen: localhost:7171
enable_metrics: true
enable_pprof: false
username: ""
password: ""
secure: false
certificate_file: ""
private_key_file: ""
ca_cert_file: ""
ca_key_file: ""
create_integration_tables: false
integration_tables_host: ""
allow_parallel: false
complete_resumable_after_restart: true
watch_is_main_process: false
ftp:
address: ""
timeout: 2m
username: ""
password: ""
tls: false
skip_tls_verify: false
path: ""
object_disk_path: ""
compression_format: tar
compression_level: 1
concurrency: 6
debug: false
sftp:
address: ""
port: 22
username: ""
password: ""
key: ""
path: ""
object_disk_path: ""
compression_format: tar
compression_level: 1
concurrency: 6
debug: false
azblob:
endpoint_schema: https
endpoint_suffix: core.windows.net
account_name: ""
account_key: ""
sas: ""
use_managed_identity: false
container: ""
path: ""
object_disk_path: ""
compression_level: 1
compression_format: tar
sse_key: ""
buffer_size: 0
buffer_count: 3
max_parts_count: 256
timeout: 4h
debug: false
custom:
upload_command: ""
download_command: ""
list_command: ""
delete_command: ""
command_timeout: 4h
commandtimeoutduration: 4h0m0s
helm-chart: bitnami image: altinity
@Slach I see your comment on https://github.com/Altinity/clickhouse-backup/issues/949 , is it possible that the error is caused by concurrent execution? I run backups in the background.
I run backups in the background.
how exactly you run backup?
via REST API / SQL or directly execute clickhouse-backup
binary inside pod with clickhouse-server
?
could you share
cat /bitnami/clickhouse/data/backup/incremental_2024-07-10T10-00-00/metadata/otel/otel_traces.json
?
I use clickhouse-backup
binary tool
metadata file (I hope you asked about this one):
{
"files": {
"default": [
"default_20240710_2207_2526_27.tar",
"default_20240710_2527_2765_26.tar",
"default_20240710_2766_2771_1.tar",
"default_20240710_2772_2772_0.tar",
"default_20240710_2773_2773_0.tar",
"default_20240710_2774_2774_0.tar"
]
},
"table": "otel_traces",
"database": "otel",
"parts": {
"default": [
{
"name": "20240703_0_0_0",
"required": true
},
{
"name": "20240703_10_10_0",
"required": true
},
{
"name": "20240703_1_1_0",
"required": true
},
{
"name": "20240703_2_2_0",
"required": true
},
{
"name": "20240703_3_3_0",
"required": true
},
{
"name": "20240703_4_4_0",
"required": true
},
{
"name": "20240703_5_5_0",
"required": true
},
{
"name": "20240703_6_6_0",
"required": true
},
{
"name": "20240703_7_7_0",
"required": true
},
{
"name": "20240703_8_8_0",
"required": true
},
{
"name": "20240703_9_9_0",
"required": true
},
{
"name": "20240704_0_0_0",
"required": true
},
{
"name": "20240704_10_10_0",
"required": true
},
{
"name": "20240704_1_1_0",
"required": true
},
{
"name": "20240704_2_2_0",
"required": true
},
{
"name": "20240704_3_3_0",
"required": true
},
{
"name": "20240704_4_4_0",
"required": true
},
{
"name": "20240704_5_5_0",
"required": true
},
{
"name": "20240704_6_6_0",
"required": true
},
{
"name": "20240704_7_7_0",
"required": true
},
{
"name": "20240704_8_8_0",
"required": true
},
{
"name": "20240704_9_9_0",
"required": true
},
{
"name": "20240705_0_0_0",
"required": true
},
{
"name": "20240705_1_1_0",
"required": true
},
{
"name": "20240705_2_2_0",
"required": true
},
{
"name": "20240705_3_3_0",
"required": true
},
{
"name": "20240705_4_4_0",
"required": true
},
{
"name": "20240705_5_5_0",
"required": true
},
{
"name": "20240705_6_6_0",
"required": true
},
{
"name": "20240705_7_7_0",
"required": true
},
{
"name": "20240710_0_1282_29",
"required": true
},
{
"name": "20240710_1283_2206_29",
"required": true
},
{
"name": "20240710_2207_2526_27"
},
{
"name": "20240710_2527_2765_26"
},
{
"name": "20240710_2766_2771_1"
},
{
"name": "20240710_2772_2772_0"
},
{
"name": "20240710_2773_2773_0"
},
{
"name": "20240710_2774_2774_0"
}
]
},
"query": "CREATE TABLE otel.otel_traces UUID '1ad4ab1f-6762-3ca3-b540-5d86aa3dca82' (`Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)), `ResourceRef` UUID CODEC(ZSTD(1)), `TraceId` String CODEC(ZSTD(1)), `SpanId` String CODEC(ZSTD(1)), `ParentSpanId` String CODEC(ZSTD(1)), `TraceState` String CODEC(ZSTD(1)), `SpanName` LowCardinality(String) CODEC(ZSTD(1)), `SpanKind` LowCardinality(String) CODEC(ZSTD(1)), `ServiceName` LowCardinality(String) CODEC(ZSTD(1)), `ScopeName` String CODEC(ZSTD(1)), `ScopeVersion` String CODEC(ZSTD(1)), `SpanAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)), `Duration` Int64 CODEC(ZSTD(1)), `StatusCode` LowCardinality(String) CODEC(ZSTD(1)), `StatusMessage` String CODEC(ZSTD(1)), `SpanParentType` LowCardinality(String) CODEC(ZSTD(1)), `Events.Timestamp` Array(DateTime64(9)) CODEC(ZSTD(1)), `Events.Name` Array(LowCardinality(String)) CODEC(ZSTD(1)), `Events.Attributes` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)), `Links.TraceId` Array(String) CODEC(ZSTD(1)), `Links.SpanId` Array(String) CODEC(ZSTD(1)), `Links.TraceState` Array(String) CODEC(ZSTD(1)), `Links.Attributes` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)), INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1, INDEX idx_span_attr_key mapKeys(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, INDEX idx_span_attr_value mapValues(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, INDEX idx_duration Duration TYPE minmax GRANULARITY 1, INDEX idx_scope_name ScopeName TYPE bloom_filter(0.001) GRANULARITY 1, INDEX idx_scope_version ScopeVersion TYPE bloom_filter(0.001) GRANULARITY 1) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') PARTITION BY toDate(Timestamp) ORDER BY (ServiceName, ResourceRef, SpanName, toUnixTimestamp(Timestamp), TraceId) TTL toDateTime(Timestamp) + toIntervalDay(7) SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1",
"size": {
"default": 11470627
},
"total_bytes": 1803692416,
"metadata_only": false
}
hm, ok, looks weird 20240710_2207_2526_27 should present after download
ok. let's try from scratch, please share full logs as text
clickhouse-backup delete local incremental_2024-07-10T10-00-00
LOG_LEVEL=debug clickhouse-backup download incremental_2024-07-10T10-00-00
It works now (as I have mentioned it happen sometimes). Maybe it is caused by backups running in the background. I'm going to disable it and test.
Maybe two different instances of clickhouse-backup
got race condition(like *.s3 cache)
and one instance drop content of partially downloaded backup ... but not sure
ok. create new issue if it will happen again
restore_remote operation isn't deterministic and sometimes throws errors like
error one of Download go-routine return error: /bitnami/clickhouse/data/backup/incremental_2024-07-10T10-00-00/shadow/otel/otel_traces/default/20240710_2207_2526_27 not found after download backup
The missing file exists on the remote storage: but not on the local file system:
I have executed the restore operation after few hours and everything was fine (the same backup), the missing directory has been created on the local storage:
I have analyzed logs and I see there the failed execution starts new download but never complete it: Link to the source code starting these downloads: https://github.com/Altinity/clickhouse-backup/blob/be523f5ba78046df70eeee0578c3d18d5c0a6e8b/pkg/backup/download.go#L644