zilliztech / milvus-backup

Backup and restore tool for Milvus
Apache License 2.0

[Bug]: Error occurs on restore if the collection has more than one partition #199

Closed (lentitude2tk closed 9 months ago)

lentitude2tk commented 10 months ago

Current Behavior

  1. create collection with more than one partition
  2. create backup
  3. restore backup -> throw error

[2023/09/06 07:03:13.405 +00:00] [ERROR] [core/backup_impl_restore_backup.go:504] ["fail to (copy and) bulkinsert data"] [error="bulk insert fail, info: failed to create binlog adapter, error: target partition must be only one"] [errorVerbose="bulk insert fail, info: failed to create binlog adapter, error: target partition must be only one\n(1) attached stack trace\n -- stack trace:\n | github.com/zilliztech/milvus-backup/core.(*BackupContext).watchBulkInsertState\n | \t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:603\n | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeBulkInsert\n | \t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:572\n | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func4\n | \t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:478\n | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask\n | \t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:502\n | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreBackupTask\n | \t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:327\n | runtime.goexit\n | \t/home/ec2-user/go/src/runtime/asm_amd64.s:1598\nWraps: (2) bulk insert fail, info: failed to create binlog adapter, error: target partition must be only one\nError types: (1) *withstack.withStack (2) *errutil.leafError"] [backupCollectionName=partitionKeyTest] [targetCollectionName=partitionKeyTest] [partition=_default_0] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask\n\t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:504\ngithub.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreBackupTask\n\t/usr/local/cloud-milvus-tool/milvus-backup/core/backup_impl_restore_backup.go:327"]

Expected Behavior

The backup is restored without errors.

Steps To Reproduce

1. create collection with more than one partition
2. create backup
3. restore backup -> throw error
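
For anyone trying to reproduce step 1, here is a minimal sketch using pymilvus. Everything specific in it is an assumption, not taken from the report: the collection name `partition_repro`, the `part_` partition names, the two-field schema, and the `localhost:19530` endpoint. The error log in this issue shows `partition=_default_0` and a collection named `partitionKeyTest`, so the original collection may have been set up differently (for example with a partition key); the thread does not confirm that, and this sketch simply creates explicit partitions.

```python
def partition_names(count, prefix="part_"):
    """Names for `count` custom partitions (hypothetical naming scheme)."""
    return [f"{prefix}{i}" for i in range(count)]


def create_partitioned_collection(name="partition_repro", partitions=5,
                                  rows=100, dim=8):
    """Create a collection with several partitions and a few rows in each."""
    # Imported lazily so partition_names() works without pymilvus installed.
    import random
    from pymilvus import (Collection, CollectionSchema, DataType,
                          FieldSchema, connections)

    # Assumed endpoint: a local Milvus instance on the default port.
    connections.connect(alias="default", host="localhost", port="19530")

    schema = CollectionSchema([
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True,
                    auto_id=False),
        FieldSchema(name="vec", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ])
    collection = Collection(name=name, schema=schema)

    for index, partition in enumerate(partition_names(partitions)):
        collection.create_partition(partition)
        # Column-based insert: one list of ids, one list of vectors.
        ids = [index * rows + i for i in range(rows)]
        vectors = [[random.random() for _ in range(dim)] for _ in range(rows)]
        collection.insert([ids, vectors], partition_name=partition)

    # Flush so the inserted rows are persisted before the backup runs.
    collection.flush()
    return collection


# Usage (needs a running Milvus instance):
#   create_partitioned_collection(partitions=5)
```

Steps 2 and 3 would then be run with the backup tool itself. As far as I know from the project README that is `./milvus-backup create -n <backup_name>` followed by `./milvus-backup restore -n <backup_name> -s <suffix>`; check the README of your version for the exact flags.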

Environment

No response

Anything else?

No response

zhuwenxing commented 10 months ago

I tried with multiple partitions, and it works well.

There are two partitions in the collection.

[2023/09/08 19:39:38.830 +08:00] [INFO] [core/backup_impl_restore_backup.go:164] ["Collections to restore"] [collection_num=1]
[2023/09/08 19:39:38.855 +08:00] [INFO] [core/backup_impl_restore_backup.go:314] ["executeRestoreBackupTask start"] [backup_name=backup_2Av8rp4o] [backupBucketName=a-bucket] [backupPath=backup/backup_2Av8rp4o]
[2023/09/08 19:39:38.856 +08:00] [INFO] [core/backup_impl_restore_backup.go:357] ["start restore"] [db_name=default] [collection_name=restore_backup_Lm1XLtqH_bak] [backupBucketName=a-bucket] [backupPath=backup/backup_2Av8rp4o]
[2023/09/08 19:39:38.856 +08:00] [INFO] [core/backup_impl_restore_backup.go:383] ["collection schema"] [fields="[{\"ID\":100,\"Name\":\"int64\",\"PrimaryKey\":true,\"AutoID\":false,\"Description\":\"\",\"DataType\":5,\"TypeParams\":{},\"IndexParams\":{},\"IsDynamic\":false,\"IsPartitionKey\":false},{\"ID\":101,\"Name\":\"float\",\"PrimaryKey\":false,\"AutoID\":false,\"Description\":\"\",\"DataType\":10,\"TypeParams\":{},\"IndexParams\":{},\"IsDynamic\":false,\"IsPartitionKey\":false},{\"ID\":102,\"Name\":\"varchar\",\"PrimaryKey\":false,\"AutoID\":false,\"Description\":\"\",\"DataType\":21,\"TypeParams\":{\"max_length\":\"65535\"},\"IndexParams\":{},\"IsDynamic\":false,\"IsPartitionKey\":false},{\"ID\":103,\"Name\":\"json\",\"PrimaryKey\":false,\"AutoID\":false,\"Description\":\"\",\"DataType\":23,\"TypeParams\":{},\"IndexParams\":{},\"IsDynamic\":false,\"IsPartitionKey\":false},{\"ID\":104,\"Name\":\"binary_vector\",\"PrimaryKey\":false,\"AutoID\":false,\"Description\":\"\",\"DataType\":100,\"TypeParams\":{\"dim\":\"128\"},\"IndexParams\":{},\"IsDynamic\":false,\"IsPartitionKey\":false}]"]
[2023/09/08 19:39:38.891 +08:00] [INFO] [core/backup_impl_restore_backup.go:418] ["create collection"] [database=default] [collectionName=restore_backup_Lm1XLtqH_bak] [hasPartitionKey=false]
[2023/09/08 19:39:38.893 +08:00] [INFO] [core/backup_impl_restore_backup.go:451] ["create partition"] [collectionName=restore_backup_Lm1XLtqH_bak] [partitionName=_default]
[2023/09/08 19:39:38.893 +08:00] [INFO] [core/backup_impl_restore_backup.go:625] [getBackupPartitionPaths] [bucketName=a-bucket] [backupPath=backup/backup_2Av8rp4o] [partitionID=444114669170255063]
[2023/09/08 19:39:38.905 +08:00] [INFO] [core/backup_impl_restore_backup.go:593] ["bulkinsert task state"] [id=444114669170255269] [state=2] [state="{\"ID\":444114669170255269,\"State\":2,\"RowCount\":0,\"IDList\":null,\"Infos\":{\"backup\":\"true\",\"collection\":\"restore_backup_Lm1XLtqH_bak\",\"end_ts\":\"444117333049344\",\"failed_reason\":\"\",\"files\":\"backup/backup_2Av8rp4o/binlogs/insert_log/444114669170255062/444114669170255063/,\",\"partition\":\"_default\"},\"CollectionID\":444114669170255265,\"SegmentIDs\":null,\"CreateTs\":1694173178}"] [progress=0] [currentTimestamp=1694173178] [lastUpdateTime=1694173178]
[2023/09/08 19:39:43.909 +08:00] [INFO] [core/backup_impl_restore_backup.go:593] ["bulkinsert task state"] [id=444114669170255269] [state=6] [state="{\"ID\":444114669170255269,\"State\":6,\"RowCount\":1500,\"IDList\":null,\"Infos\":{\"backup\":\"true\",\"collection\":\"restore_backup_Lm1XLtqH_bak\",\"end_ts\":\"444117333049344\",\"failed_reason\":\"\",\"files\":\"backup/backup_2Av8rp4o/binlogs/insert_log/444114669170255062/444114669170255063/,\",\"partition\":\"_default\",\"persist_cost\":\"0.52\",\"progress_percent\":\"100\"},\"CollectionID\":444114669170255265,\"SegmentIDs\":[444114669170255277,444114669170255275],\"CreateTs\":1694173178}"] [progress=100] [currentTimestamp=1694173183] [lastUpdateTime=1694173178]
[2023/09/08 19:39:43.927 +08:00] [INFO] [core/backup_impl_restore_backup.go:451] ["create partition"] [collectionName=restore_backup_Lm1XLtqH_bak] [partitionName=partition__OXSQ1vcI]
[2023/09/08 19:39:43.927 +08:00] [INFO] [core/backup_impl_restore_backup.go:625] [getBackupPartitionPaths] [bucketName=a-bucket] [backupPath=backup/backup_2Av8rp4o] [partitionID=444114669170255067]
[2023/09/08 19:39:43.939 +08:00] [INFO] [core/backup_impl_restore_backup.go:593] ["bulkinsert task state"] [id=444114669170255310] [state=2] [state="{\"ID\":444114669170255310,\"State\":2,\"RowCount\":0,\"IDList\":null,\"Infos\":{\"backup\":\"true\",\"collection\":\"restore_backup_Lm1XLtqH_bak\",\"end_ts\":\"444117333049344\",\"failed_reason\":\"\",\"files\":\"backup/backup_2Av8rp4o/binlogs/insert_log/444114669170255062/444114669170255067/,\",\"partition\":\"partition__OXSQ1vcI\"},\"CollectionID\":444114669170255265,\"SegmentIDs\":null,\"CreateTs\":1694173183}"] [progress=0] [currentTimestamp=1694173183] [lastUpdateTime=1694173183]
[2023/09/08 19:39:48.942 +08:00] [INFO] [core/backup_impl_restore_backup.go:593] ["bulkinsert task state"] [id=444114669170255310] [state=6] [state="{\"ID\":444114669170255310,\"State\":6,\"RowCount\":1500,\"IDList\":null,\"Infos\":{\"backup\":\"true\",\"collection\":\"restore_backup_Lm1XLtqH_bak\",\"end_ts\":\"444117333049344\",\"failed_reason\":\"\",\"files\":\"backup/backup_2Av8rp4o/binlogs/insert_log/444114669170255062/444114669170255067/,\",\"partition\":\"partition__OXSQ1vcI\",\"persist_cost\":\"0.48\",\"progress_percent\":\"100\"},\"CollectionID\":444114669170255265,\"SegmentIDs\":[444114669170255313,444114669170255315],\"CreateTs\":1694173183}"] [progress=100] [currentTimestamp=1694173188] [lastUpdateTime=1694173183]
[2023/09/08 19:39:48.942 +08:00] [INFO] [core/backup_impl_restore_backup.go:335] ["finish restore collection"] [db_name=default] [collection_name=restore_backup_Lm1XLtqH_bak]
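
The "bulkinsert task state" lines above are hard to read by eye, so here is a small sketch for pulling the partition name, state code, and row count out of one such line. It uses only the Python standard library and assumes the bracketed `[key=value]` layout seen in these logs; the function names are made up for illustration, and the sample line is a trimmed copy of the last task-state line above. It reports the numeric state as-is and does not try to interpret it.

```python
import json
import re

# Matches one "[key=value]" field. The value is either a double-quoted string
# (which may contain escaped quotes and square brackets) or a bare token.
FIELD_RE = re.compile(r'\[(\w+)=("(?:[^"\\]|\\.)*"|[^\]]*)\]')


def parse_fields(line):
    """Return the key=value fields of one log line as (key, value) pairs.

    A list is used instead of a dict because these lines repeat the
    "state" key (once as a number, once as a JSON document).
    """
    fields = []
    for key, raw in FIELD_RE.findall(line):
        # Quoted values use JSON-style escaping, so json.loads unescapes them.
        value = json.loads(raw) if raw.startswith('"') else raw
        fields.append((key, value))
    return fields


def bulkinsert_summary(line):
    """Extract partition, state, and row count from a task-state line."""
    for key, value in parse_fields(line):
        if key == "state" and value.startswith("{"):
            task = json.loads(value)
            return {
                "partition": task["Infos"]["partition"],
                "state": task["State"],
                "rows": task["RowCount"],
            }
    return None  # not a bulkinsert task-state line


# Trimmed copy of a line from the log above (some JSON keys removed).
sample = (
    r'[2023/09/08 19:39:48.942 +08:00] [INFO] '
    r'[core/backup_impl_restore_backup.go:593] ["bulkinsert task state"] '
    r'[id=444114669170255310] [state=6] '
    r'[state="{\"ID\":444114669170255310,\"State\":6,\"RowCount\":1500,'
    r'\"Infos\":{\"partition\":\"partition__OXSQ1vcI\",'
    r'\"progress_percent\":\"100\"},'
    r'\"SegmentIDs\":[444114669170255313,444114669170255315]}"] '
    r'[progress=100]'
)

print(bulkinsert_summary(sample))
# {'partition': 'partition__OXSQ1vcI', 'state': 6, 'rows': 1500}
```

Run over the whole log, this shows one task per partition (`_default` and `partition__OXSQ1vcI`), each finishing with 1500 rows.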

lentitude2tk commented 9 months ago

Try creating more partitions. The error reported in this issue was triggered when restoring a collection with 5 partitions.