Altinity / clickhouse-backup

Tool for easy backup and restore for ClickHouse® using object storage for backup files.
https://altinity.com

Restoring backup - Data only on node restored to #973

Closed tman5 closed 3 months ago

tman5 commented 3 months ago

After restoring a backup to a cluster, the data only appears to be on the first node in the cluster. The other nodes don't show any disk space usage from the data. Here is the remote_servers config:

remote_servers:
  cluster:
    shard:
      internal_replication: true
      replica:
        host: 201
        port: 9000
      replica:
        host: 202
        port: 9000
      replica:
        host: 203
        port: 9000

On ALL the nodes I ran the following:

clickhouse-backup restore_remote --rm --schema backup_name

On the first node (201) I ran the following:

clickhouse-backup restore_remote --rm backup_name

Slach commented 3 months ago

Could you share the output of grep Replicated -r /var/lib/clickhouse/backup/<backup-name>/metadata/ ?

tman5 commented 3 months ago

That returns nothing. The only folder inside that metadata folder is default, since the backup/restore was of the default database.

Slach commented 3 months ago

If you don't have Replicated* tables, why do you define internal_replication: true?

If you don't have Replicated* tables, then you need to restore data on all replicas

or convert engine=MergeTree to engine=ReplicatedMergeTree manually: https://github.com/Altinity/clickhouse-backup/blob/master/Examples.md#how-to-convert-mergetree-to-replicatedmergetree
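
For the first option (restoring data on every replica), a minimal dry-run sketch may help. It only prints the commands rather than running them. The use of ssh and the function name print_restore_commands are assumptions for illustration, the host names are the placeholders from the config above, and the restore command itself is the one already used on node 201. It also assumes that none of the tables use a Replicated* engine.

```shell
#!/bin/sh
# Print (do not run) one restore command per replica.
# Assumption: each replica is reachable over ssh and has clickhouse-backup
# configured against the same remote storage.
print_restore_commands() {
    backup_name="$1"
    shift
    for host in "$@"; do
        printf 'ssh %s clickhouse-backup restore_remote --rm %s\n' "$host" "$backup_name"
    done
}

# Hosts and backup name are the placeholders used in this thread.
print_restore_commands backup_name 201 202 203
```

Reviewing the printed commands before running them by hand keeps the destructive --rm step explicit for each node.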

tman5 commented 3 months ago

@Slach The ClickHouse cluster it was restored from had internal_replication: true set on it. It's 1 shard with 3 replicas. I don't see any ReplicatedMergeTree queries in the metadata. So right now I'm assuming when we make queries to the 2 and 3 nodes it's just going through node 1 to get the data?

Slach commented 3 months ago

> So right now I'm assuming when we make queries to the 2 and 3 nodes it's just going through node 1 to get the data?

I don't understand this question.

Do you have ReplicatedMergeTree in original cluster?

tman5 commented 3 months ago

I looked at the metadata files in the backup from the original cluster, and 2 of the 4 tables have ENGINE=MergeTree, not ReplicatedMergeTree. So I'm assuming that's why those tables don't replicate.
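
The check discussed in this thread can be inverted to list the tables that will not replicate. The sketch below assumes the metadata layout from the grep command earlier in the thread (one file per table under metadata/<database>/); the function name list_non_replicated is hypothetical. It reports every metadata file that never mentions Replicated, so views or other non-MergeTree objects may show up too.

```shell
#!/bin/sh
# List backup metadata files that contain no "Replicated" engine reference.
list_non_replicated() {
    metadata_dir="$1"
    # -r: recurse into database folders; -L: print only files with no match
    grep -rL 'Replicated' "$metadata_dir" | sort
}

# Example (the backup name is a placeholder, as in the thread):
# list_non_replicated "/var/lib/clickhouse/backup/<backup-name>/metadata/"
```

Each file listed corresponds to a table whose data would have to be restored on every replica, or whose engine would have to be converted, for all nodes to hold the data.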