The root cause is not related to clickhouse-backup. Your schema contains:

```sql
    `dt` DateTime
)
ENGINE = MergeTree()
PARTITION BY dt
```

You should replace `PARTITION BY dt` with `PARTITION BY toYYYYMM(dt)`.
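For illustration, a sketch of the corrected table definition; the `db.backup` name and the `ORDER BY` clause are assumptions, since the full original schema isn't quoted here:

```sql
CREATE TABLE db.backup
(
    `dt` DateTime
)
ENGINE = MergeTree()
-- one partition per month instead of one per distinct second
PARTITION BY toYYYYMM(dt)
ORDER BY dt;
```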
Every partition is a separate directory prefix in `/var/lib/clickhouse/data/db/table/<partition_prefix>_<min_block>_<max_block>_<background_merges_count>/`, so your schema will produce a huge number of partitions, with as little as one row in each partition.
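As a quick sanity check, you can count the part directories on disk; the `db/table` path segments below are placeholders for your actual database and table names:

```bash
# each partition contributes at least one <partition>_<min>_<max>_<level> directory
ls /var/lib/clickhouse/data/db/table/ | wc -l
```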
I checked the folder, and there are only two partitions, because the data was inserted within a single second and the table is partitioned by second.
So does the number of partitions, or the partitioning itself, affect incremental backups? @Slach
Sorry, I was misled by your question. Let me try to describe it more deeply.

Partitioning by the raw `dt` value is an anti-pattern; please don't use it in production. You can look at the resulting data parts with:

```sql
SELECT * FROM system.parts WHERE table='backup' FORMAT Vertical
```
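To see the fragmentation at a glance, a hedged variant of that query which aggregates per partition:

```sql
-- count active parts and total rows per partition
SELECT partition, count() AS parts, sum(rows) AS total_rows
FROM system.parts
WHERE table = 'backup' AND active
GROUP BY partition
ORDER BY partition;
```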
When you create backup `2`, clickhouse-backup makes two hard links for the two data parts and uploads only the second, new part (according to `--diff-from`) to remote storage; this requires that a local backup with the name `1` exists.
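A minimal sketch of that incremental workflow, assuming the backup names `1` and `2` used in this thread:

```bash
# full backup "1": created locally, then uploaded in full
clickhouse-backup create 1
clickhouse-backup upload 1

# incremental backup "2": hard-links local parts, uploads only
# the parts that are not already present in backup "1"
clickhouse-backup create 2
clickhouse-backup upload --diff-from 1 2
```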
Do you check the backup sizes via `clickhouse-backup list`?
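For comparison, a sketch of listing backups on both sides; `local` and `remote` are the arguments as I understand the CLI:

```bash
clickhouse-backup list local
clickhouse-backup list remote
```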
I think I understand what you mean. So can I assume that if I delete the local backup `2` and use the remote backup `2` to recover the data, there will be only 1/2 of the data in the table?
After `clickhouse-backup delete remote 1` (which you used as `--diff-from` for `2`), you will receive an error when you try `clickhouse-backup restore_remote 2`.
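A sketch of that failure mode, with the same backup names:

```bash
# deleting the base backup breaks the incremental chain
clickhouse-backup delete remote 1

# this now fails: backup "2" records "1" as its required_backup
clickhouse-backup restore_remote 2
```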
You can't restore 1/2 of the data; that would make no sense.
A backup is not "remote storage" for your data. It's a snapshot which you use in a disaster recovery process.
OK, I fully understand now. Thank you very much, and Happy New Year!!!
Hello, I'm confused about the backups. I did this:

1. first insert data
2. first backup
3. second insert data
4. second backup
Then I checked, and the sizes of the two backups are similar. And I found `"required_backup": "1"` in `metadata.json`.
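One way to check that field directly, assuming clickhouse-backup's default local backup directory (the path is my assumption):

```bash
# show the base-backup reference recorded for incremental backup "2"
grep required_backup /var/lib/clickhouse/backup/2/metadata.json
```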
So I think backup `2` is an incremental backup, and I therefore expected to get a table with 1,000,000 rows. But I found that there are 2,000,000 rows in the table. So I want to know how this works. Thank you very much, best wishes to you.