zilliztech / milvus-backup

Backup and restore tool for Milvus
Apache License 2.0

Does milvus-backup support databases yet? #364

Open mkotsalainen opened 1 week ago

mkotsalainen commented 1 week ago

Hi!

I'm running milvus-backup:v0.4.15

When I try to restore a specific collection that lives in a database it doesn't work - I'm only able to restore collections that are in the default database.

When I don't pass in a collection name milvus-backup restores some of the collections that live in databases but not others.

This GitHub commit mentions passing in a database parameter, but it is still TODO, right?

So what is the status of this? Can milvus-backup work with databases other than the default one?

Thank you.

wayblink commented 1 week ago

@mkotsalainen Hi, this has been supported for a long time. All databases are backed up by default. I'm not sure what issue you have met. Please use ./milvus-backup create -h to see the detailed usage. If you still hit a problem, please provide us with the details.

mkotsalainen commented 1 week ago

@wayblink Yes. I see that the databases are backed up but I can't figure out how to restore them.

Looking at your swagger file, I don't see any way to pass a database as a parameter in backuppb.RestoreBackupRequest. When I pass in a collection name, it only works if the collection is in the default database.

This is what I'm referring to:

      db_collections:
        description: database and collections to restore. A json string. To support
          database. 2023.7.7

https://github.com/zilliztech/milvus-backup/blob/161a5934665540a4177d602d41c90b38b526dd6c/docs/swagger.yaml#L397

mkotsalainen commented 1 week ago

Here is some more context.

I'm trying to restore a collection called dedw_embeddings in the db dedw.

Since I can't pass dedw as a parameter to the /restore endpoint, I just pass dedw_embeddings in collection_names.

milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:05:03.490147740+02:00 [2024/06/20 10:05:03.490 +00:00] [INFO] [core/backup_impl_restore_backup.go:28] ["receive RestoreBackupRequest"] [requestId=bb8c6ada-223e-486c-ba41-83ff937286b9] [backupName=backup_20240619_1408] [onlyMeta=false] [restoreIndex=false] [useAutoIndex=false] [dropExistCollection=false] [dropExistIndex=false] [skipCreateCollection=false] [collections="[dedw_embeddings]"] [CollectionSuffix=] [CollectionRenames=null] [async=true] [bucketName=] [path=] [databaseCollections=]
milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:05:03.490173468+02:00 [2024/06/20 10:05:03.490 +00:00] [INFO] [core/backup_context.go:200] ["receive GetBackupRequest"] [requestId=9000ba00-2eec-11ef-a17b-8648b4be2bdb] [backupName=backup_20240619_1408] [backupId=] [bucketName=] [path=]
milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:05:03.719354161+02:00 [2024/06/20 10:05:03.690 +00:00] [INFO] [core/backup_context.go:277] ["finish GetBackupRequest"] [requestId=9000ba00-2eec-11ef-a17b-8648b4be2bdb] [backupName=backup_20240619_1408] [backupId=] [bucketName=] [path=] [resp="requestId:\"9000ba00-2eec-11ef-a17b-8648b4be2bdb\" msg:\"success\" data:<id:\"fcb70dec-95ab-4d00-bcf3-dcc9e846d241\" state_code:BACKUP_SUCCESS start_time:1718806082109 end_time:1718806432298 name:\"backup_20240619_1408\" collection_backups:<id:\"fcb70dec-95ab-4d00-bcf3-dcc9e846d241\" start_time:1718806083 end_time:1718806317 collection_id:446748240558194207 db_name:\"ierte\" collection_name:\"ierte_20240125_1138\" schema:<name:\"ierte_20240125_1138\" description:\"ierte\" fields:<fieldID:100 name:\"id\" is_primary_key:true data_type:VarChar type_params:<key:\"max_length\" value:\"255\" > >
….[snip]….
collection_backups:<id:\"fcb70dec-95ab-4d00-bcf3-dcc9e846d241\" start_time:1718806085 end_time:1718806235 collection_id:449120998903386073 db_name:\"dedw\" collection_name:\"dedw_embeddings\" schema:<name:\"dedw_embeddings\" description:\"dedw_embeddings\" fields:<fieldID:100 name:\"id\" is_primary_key:true data_type:VarChar type_params:<key:\"max_length\" value:\"255\" > > fields:<fieldID:101 name:\"vec\" data_type:FloatVector type_params:<key:\"dim\" value:\"768\" > > fields:<fieldID:102 name:\"language\" data_type:VarChar type_params:<key:\"max_length\" value:\"32\" > > fields:<fieldID:103 name:\"
…[snip]….
 value:\"8\" > params:<key:\"efConstruction\" value:\"16\" > params:<key:\"index_type\" value:\"HNSW\" > params:<key:\"metric_type\" value:\"IP\" > > load_state:\"Loaded\" backup_physical_timestamp:1718806090 > size:26892150076 milvus_version:\"v2.3.3\" > "]
…[snip]…

milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:14:09.294651346+02:00 [2024/06/20 10:14:09.294 +00:00] [INFO] [core/backup_impl_restore_backup.go:186] ["Collections to restore"] [collection_num=0]
milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:14:09.294654911+02:00 [2024/06/20 10:14:09.294 +00:00] [INFO] [core/backup_impl_restore_backup.go:340] ["Start collection level restore pool"] [parallelism=1]
milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:14:09.294659237+02:00 [2024/06/20 10:14:09.294 +00:00] [INFO] [core/backup_impl_restore_backup.go:344] ["executeRestoreBackupTask start"] [backup_name=backup_20240619_1408] [backupBucketName=peach-dev-milvus] [backupPath=snapshot/backup_20240619_1408]
milvus milvus-backup-server-547d64f894-5c7cb milvus-backup-server 2024-06-20T12:14:09.294670120+02:00 [GIN] 2024/06/20 - 10:14:09 | 200 |  229.204616ms |     10.1.122.17 | POST     "/api/v1/restore"

Here are the logs from my client that calls the backup server; it prints the response it gets:

Restoring snapshot backup_20240619_1408 {'backup_name': 'backup_20240619_1408', 'async': False, 'collection_names': ['dedw_embeddings']}
{'data': {'id': 'restore_2024_06_20_10_15_44_923334005',
          'restored_size': 0,
          'start_time': 1718878544,
          'to_restore_size': 0},
 'msg': 'success',
 'requestId': 'd18b1c7e-90bb-453c-b8f8-e6e7888f7ce5'}
wayblink commented 1 week ago

@mkotsalainen Well, it seems the swagger file is a bit out of date. This is supported: set db_collections to do it. For example:

curl --location --request POST 'http://localhost:8080/api/v1/restore' \
--header 'Content-Type: application/json' \
--data-raw '{
    "backup_name":"mybackup",
    "id":"abc",
    "collection_suffix": "_recover",
    "db_collections": {"db1":["collection1"],"db2":["collection2","collection3"]},
    "async":true
}'
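For a client calling the REST endpoint programmatically, the same request can be sketched in Python. This mirrors the curl example above; the host, port, backup name, and database/collection names are placeholders to adjust for your deployment.

```python
import json
import urllib.request


def restore_backup(host: str, payload: dict) -> dict:
    """POST a restore request to the backup server and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{host}/api/v1/restore",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Payload mirroring the curl example: "db_collections" maps each
# database name to the list of collections to restore from it.
payload = {
    "backup_name": "mybackup",
    "collection_suffix": "_recover",
    "db_collections": {"db1": ["collection1"], "db2": ["collection2", "collection3"]},
    "async": True,
}

# Example call (requires a running backup server):
# restore_backup("http://localhost:8080", payload)
```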
mkotsalainen commented 1 week ago

Thanks @wayblink, this got me a bit further, but unfortunately now the server crashes. In your example above you supply an id; I don't (because I'm not sure what that id should be). I just supply the db_collections param.

Here are some logs that might be useful.

+ milvus milvus-backup-server-77c8b8bf75-7wzgf › milvus-backup-server
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:04.283998617+02:00 0.0.1 (Built on unknown from Git SHA unknown)
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:04.284038699+02:00 config:backup.yaml
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:04.284494419+02:00 [2024/06/20 17:42:04.284 +00:00] [INFO] [logutil/logutil.go:165] ["Log directory"] [configDir=]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:04.284498438+02:00 [2024/06/20 17:42:04.284 +00:00] [INFO] [logutil/logutil.go:166] ["Set log file to "] [path=logs/backup.log]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:04.284762579+02:00 [2024/06/20 17:42:04.284 +00:00] [INFO] [core/backup_context.go:96] ["{Base:0xc000433900 MaxSegmentGroupSize:2147483648 BackupCollectionParallelism:1 BackupCopyDataParallelism:128 RestoreParallelism:1 KeepTempFiles:false GcPauseEnable:false GcPauseSeconds:7200 GcPauseAddress:http://localhost:9091}"]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:04.284768796+02:00 [2024/06/20 17:42:04.284 +00:00] [INFO] [core/backup_context.go:97] ["{Base:0xc000433900 Enabled:true DebugMode:false SimpleResponse:true}"]

...snip...

milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:57.456499464+02:00 [2024/06/20 17:42:57.456 +00:00] [INFO] [core/backup_impl_restore_backup.go:850] [getBackupPartitionPaths] [bucketName=peach-dev-milvus] [backupPath=snapshot/backup_20240619_1408] [partitionID=449120998903386074] [groupId=449120998903839632]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:57.472546224+02:00 [2024/06/20 17:42:57.472 +00:00] [INFO] [core/backup_impl_restore_backup.go:743] ["execute bulk insert"] [db=dedw] [collection=dedw_embeddings] [partition=_default] [files="[snapshot/backup_20240619_1408/binlogs/insert_log/449120998903386073/449120998903386074/449120998903836176/,snapshot/backup_20240619_1408/binlogs/delta_log/449120998903386073/449120998903386074/449120998903836176/]"] [endTime=0]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:42:57.702740643+02:00 [2024/06/20 17:42:57.702 +00:00] [INFO] [core/backup_impl_restore_backup.go:794] ["bulkinsert task state"] [id=450427922694049950] [state=2] [state="{\"ID\":450427922694049950,\"State\":2,\"RowCount\":0,\"IDList\":null,\"Infos\":{\"backup\":\"true\",\"collection\":\"dedw_embeddings\",\"failed_reason\":\"\",\"files\":\"snapshot/backup_20240619_1408/binlogs/insert_log/449120998903386073/449120998903386074/449120998903836176/,snapshot/backup_20240619_1408/binlogs/delta_log/449120998903386073/449120998903386074/449120998903836176/\",\"partition\":\"_default\"},\"CollectionID\":450427922694049938,\"SegmentIDs\":null,\"CreateTs\":1718905377}"] [progress=0] [currentTimestamp=1718905377] [lastUpdateTime=1718905377]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:43:02.706829415+02:00 [2024/06/20 17:43:02.706 +00:00] [INFO] [core/backup_impl_restore_backup.go:794] ["bulkinsert task state"] [id=450427922694049950] [state=1] [state="{\"ID\":450427922694049950,\"State\":1,\"RowCount\":0,\"IDList\":null,\"Infos\":{\"backup\":\"true\",\"collection\":\"dedw_embeddings\",\"failed_reason\":\"fail to initialize in-memory segment data for shard id 0\",\"files\":\"snapshot/backup_20240619_1408/binlogs/insert_log/449120998903386073/449120998903386074/449120998903836176/,snapshot/backup_20240619_1408/binlogs/delta_log/449120998903386073/449120998903386074/449120998903836176/\",\"partition\":\"_default\",\"progress_percent\":\"0\"},\"CollectionID\":450427922694049938,\"SegmentIDs\":null,\"CreateTs\":1718905377}"] [progress=0] [currentTimestamp=1718905382] [lastUpdateTime=1718905377]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:43:02.706930925+02:00 [2024/06/20 17:43:02.706 +00:00] [ERROR] [core/backup_impl_restore_backup.go:775] ["fail or timeout to bulk insert"] [error="bulk insert fail, info: fail to initialize in-memory segment data for shard id 0"] [errorVerbose="bulk insert fail, info: fail to initialize in-memory segment data for shard id 0\n(1) attached stack trace\n  -- stack trace:\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).watchBulkInsertState\n  | \t/app/core/backup_impl_restore_backup.go:804\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeBulkInsert\n  | \t/app/core/backup_impl_restore_backup.go:773\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func3\n  | \t/app/core/backup_impl_restore_backup.go:588\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func7\n  | \t/app/core/backup_impl_restore_backup.go:668\n  | github.com/zilliztech/milvus-backup/internal/common.(*WorkerPool).work.func1\n  | \t/app/internal/common/workerpool.go:70\n  | golang.org/x/sync/errgroup.(*Group).Go.func1\n  | \t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\n  | runtime.goexit\n  | \t/usr/local/go/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: fail to initialize in-memory segment data for shard id 0\nError types: (1) *withstack.withStack (2) *errutil.leafError"] [taskId=450427922694049950] [targetCollectionName=dedw_embeddings] [partitionName=_default] 
[stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).executeBulkInsert\n\t/app/core/backup_impl_restore_backup.go:775\ngithub.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func3\n\t/app/core/backup_impl_restore_backup.go:588\ngithub.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func7\n\t/app/core/backup_impl_restore_backup.go:668\ngithub.com/zilliztech/milvus-backup/internal/common.(*WorkerPool).work.func1\n\t/app/internal/common/workerpool.go:70\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75"]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:43:02.707005201+02:00 [2024/06/20 17:43:02.706 +00:00] [ERROR] [core/backup_impl_restore_backup.go:668] ["fail to bulk insert to partition"] [backup_db_name=dedw] [backup_collection_name=dedw] [target_db_name=dedw] [target_collection_name=dedw_embeddings] [partition=_default] [error="bulk insert fail, info: fail to initialize in-memory segment data for shard id 0"] [errorVerbose="bulk insert fail, info: fail to initialize in-memory segment data for shard id 0\n(1) attached stack trace\n  -- stack trace:\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).watchBulkInsertState\n  | \t/app/core/backup_impl_restore_backup.go:804\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeBulkInsert\n  | \t/app/core/backup_impl_restore_backup.go:773\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func3\n  | \t/app/core/backup_impl_restore_backup.go:588\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func7\n  | \t/app/core/backup_impl_restore_backup.go:668\n  | github.com/zilliztech/milvus-backup/internal/common.(*WorkerPool).work.func1\n  | \t/app/internal/common/workerpool.go:70\n  | golang.org/x/sync/errgroup.(*Group).Go.func1\n  | \t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\n  | runtime.goexit\n  | \t/usr/local/go/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: fail to initialize in-memory segment data for shard id 0\nError types: (1) *withstack.withStack (2) *errutil.leafError"] 
[stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func7\n\t/app/core/backup_impl_restore_backup.go:668\ngithub.com/zilliztech/milvus-backup/internal/common.(*WorkerPool).work.func1\n\t/app/internal/common/workerpool.go:70\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75"]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:43:02.707713915+02:00 [2024/06/20 17:43:02.707 +00:00] [ERROR] [core/backup_impl_restore_backup.go:357] ["executeRestoreCollectionTask failed"] [TargetDBName=dedw] [TargetCollectionName=dedw_embeddings] [error="bulk insert fail, info: fail to initialize in-memory segment data for shard id 0"] [errorVerbose="bulk insert fail, info: fail to initialize in-memory segment data for shard id 0\n(1) attached stack trace\n  -- stack trace:\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).watchBulkInsertState\n  | \t/app/core/backup_impl_restore_backup.go:804\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeBulkInsert\n  | \t/app/core/backup_impl_restore_backup.go:773\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func3\n  | \t/app/core/backup_impl_restore_backup.go:588\n  | github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreCollectionTask.func7\n  | \t/app/core/backup_impl_restore_backup.go:668\n  | github.com/zilliztech/milvus-backup/internal/common.(*WorkerPool).work.func1\n  | \t/app/internal/common/workerpool.go:70\n  | golang.org/x/sync/errgroup.(*Group).Go.func1\n  | \t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\n  | runtime.goexit\n  | \t/usr/local/go/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: fail to initialize in-memory segment data for shard id 0\nError types: (1) *withstack.withStack (2) *errutil.leafError"] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreBackupTask.func1\n\t/app/core/backup_impl_restore_backup.go:357\ngithub.com/zilliztech/milvus-backup/internal/common.(*WorkerPool).work.func1\n\t/app/internal/common/workerpool.go:70\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75"]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:43:02.707813026+02:00 [2024/06/20 17:43:02.707 +00:00] [ERROR] [core/backup_impl_restore_backup.go:321] ["execute restore collection fail"] [backupId=fcb70dec-95ab-4d00-bcf3-dcc9e846d241] [error="workerpool: execute job bulk insert fail, info: fail to initialize in-memory segment data for shard id 0"] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).RestoreBackup\n\t/app/core/backup_impl_restore_backup.go:321\ngithub.com/zilliztech/milvus-backup/core.(*Handlers).handleRestoreBackup\n\t/app/core/backup_server.go:241\ngithub.com/zilliztech/milvus-backup/core.(*Handlers).RegisterRoutesTo.func6\n\t/app/core/backup_server.go:125\ngithub.com/gin-gonic/gin.(*Context).Next\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/context.go:173\ngithub.com/gin-gonic/gin.CustomRecoveryWithWriter.func1\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/recovery.go:101\ngithub.com/gin-gonic/gin.(*Context).Next\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/context.go:173\ngithub.com/gin-gonic/gin.LoggerWithConfig.func1\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/logger.go:240\ngithub.com/gin-gonic/gin.(*Context).Next\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/context.go:173\ngithub.com/gin-gonic/gin.(*Engine).handleHTTPRequest\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/gin.go:616\ngithub.com/gin-gonic/gin.(*Engine).ServeHTTP\n\t/go/pkg/mod/github.com/gin-gonic/gin@v1.8.1/gin.go:572\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2916\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1966"]
milvus milvus-backup-server-77c8b8bf75-7wzgf milvus-backup-server 2024-06-20T19:43:02.712123890+02:00 [GIN] 2024/06/20 - 17:43:02 | 200 |  6.564680391s |    10.1.202.249 | POST     "/api/v1/restore"
wayblink commented 1 week ago

@mkotsalainen What's your Milvus version? This is an error that happens when restoring data via BulkInsert.

wayblink commented 1 week ago

fail to initialize in-memory segment data for shard id 0

You can get more information about this error from the Milvus server log. Please upload your Milvus log if convenient.

mkotsalainen commented 1 week ago

@wayblink I searched the milvus namespace for error logs, and these are some of the errors I see when I try to restore. I've also attached complete logs from the milvus namespace (logs.txt). The Milvus server version is v2.3.3, BTW. The milvus-backup server is a Docker image built a few days ago from the main branch.

milvus milvus-proxy-7f9c964c89-vmtvw proxy 2024-06-21T20:05:54.735230626+02:00 [2024/06/21 18:05:54.735 +00:00] [INFO] [funcutil/policy.go:43] ["GetExtension fail"] [error="proto: not an extendable proto.Message"]
milvus milvus-proxy-7f9c964c89-vmtvw proxy 2024-06-21T20:05:54.735273270+02:00 [2024/06/21 18:05:54.735 +00:00] [INFO] [proxy/privilege_interceptor.go:72] ["GetPrivilegeExtObj err"] [error="proto: not an extendable proto.Message"]

...snip...

milvus milvus-rootcoord-7798cc6c8-mpggk rootcoord 2024-06-21T20:08:57.193104756+02:00 [2024/06/21 18:08:57.192 +00:00] [INFO] [rootcoord/import_manager.go:601] ["importManager update task info"] [toPersistImportTaskInfo="id:450427922694478318 datanode_id:307 collection_id:450427922694478307 partition_id:450427922694478308 channel_names:\"by-dev-rootcoord-dml_7_450427922694478307v0\" files:\"snapshot/backup_20240619_1408/binlogs/insert_log/449120998903386073/449120998903386074/449120998903836176/\" files:\"snapshot/backup_20240619_1408/binlogs/delta_log/449120998903386073/449120998903386074/449120998903836176/\" create_ts:1718993335 state:<stateCode:ImportFailed error_message:\"fail to initialize in-memory segment data for shard id 0\" > collection_name:\"dedw_embeddings\" partition_name:\"_default\" infos:<key:\"backup\" value:\"true\" > infos:<key:\"progress_percent\" value:\"0\" > start_ts:1718993335 "]

...snip...

milvus milvus-querycoord-77c5b8dc6c-9jgft querycoord 2024-06-20T09:43:24.893804179+02:00 [2024/06/20 07:43:24.893 +00:00] [ERROR] [retry/retry.go:46] ["retry func failed"] ["retry time"=0] [error="rpc error: code = Canceled desc = context canceled"] [stack="github.com/milvus-io/milvus/pkg/util/retry.Do\n\t/go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:46\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:429\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:513\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529\ngithub.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75\ngithub.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125\ngithub.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155\ngithub.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276\ngithub.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152\ngithub.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484\ngithub.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction\n\t/go/src/github.com/milvus
-io/milvus/internal/querycoordv2/task/executor.go:373\ngithub.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).Execute.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:117"]

...snip...

milvus milvus-querycoord-77c5b8dc6c-9jgft querycoord 2024-06-20T09:43:24.894189947+02:00 [2024/06/20 07:43:24.893 +00:00] [WARN] [grpcclient/client.go:516] ["ClientBase Call grpc call get error"] [role=querynode-290] [address=10.1.127.227:21123] [error="stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:515 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75 github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484 github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:373 github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction: attempt #0: rpc error: code = Canceled desc = context canceled: context canceled"] [errorVerbose="stack trace: 
/go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace: attempt #0: rpc error: code = Canceled desc = context canceled: context canceled\n(1) attached stack trace\n  -- stack trace:\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n  | \t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:515\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n  | \t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529\n  | github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n  | \t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75\n  | github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n  | \t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125\n  | github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155\n  | github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276\n  | github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152\n  | github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484\n  | github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:373\n  | github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).Execute.func1\n  | 
\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:117\n  | runtime.goexit\n  | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (2) stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace\n  | /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:515 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n  | /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n  | /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75 github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n  | /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484 github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:373 github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction\nWraps: (3) attempt #0: rpc error: code = Canceled desc = context canceled\nWraps: (4) context canceled\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) 
merr.multiErrors (4) *errors.errorString"]
milvus milvus-querycoord-77c5b8dc6c-9jgft querycoord 2024-06-20T09:43:24.894394756+02:00 [2024/06/20 07:43:24.894 +00:00] [WARN] [task/executor.go:486] ["failed to unsubscribe channel, it may be a false failure"] [taskID=1718372481731] [collectionID=450427922693603988] [replicaID=450460702763909125] [channel=by-dev-rootcoord-dml_2_450427922693603988v0] [node=290] [source=balance_checker] [error="stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:515 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75 github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484 github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:373 
github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction: attempt #0: rpc error: code = Canceled desc = context canceled: context canceled"] [errorVerbose="stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace: attempt #0: rpc error: code = Canceled desc = context canceled: context canceled\n(1) attached stack trace\n  -- stack trace:\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n  | \t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:515\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n  | \t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529\n  | github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n  | \t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75\n  | github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n  | \t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125\n  | github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155\n  | github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276\n  | github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152\n  | github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484\n  | github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction\n  | 
\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:373\n  | github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).Execute.func1\n  | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:117\n  | runtime.goexit\n  | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (2) stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace\n  | /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:515 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n  | /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:529 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n  | /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:75 github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]\n  | /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:125 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).UnsubDmChannel\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:155 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel.func1\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:276 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).send\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/session/cluster.go:152 github.com/milvus-io/milvus/internal/querycoordv2/session.(*QueryCluster).UnsubDmChannel\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:484 github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).unsubDmChannel\n  | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/task/executor.go:373 
github.com/milvus-io/milvus/internal/querycoordv2/task.(*Executor).executeDmChannelAction\nWraps: (3) attempt #0: rpc error: code = Canceled desc = context canceled\nWraps: (4) context canceled\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) merr.multiErrors (4) *errors.errorString"]
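For reference, this is roughly the restore payload I'd expect the `db_collections` field to take, based on the swagger description quoted earlier. A minimal sketch only: the exact field names and the JSON-string encoding of `db_collections` are assumptions drawn from the swagger file, not a verified call against the server.

```python
import json

# Sketch of a POST /api/v1/restore body using db_collections.
# Per the swagger description, db_collections is itself a JSON-encoded
# *string* mapping database names to the collections to restore from them
# (assumption: field names follow the swagger file referenced above).
db_collections = {"dedw": ["dedw_embeddings"]}

payload = {
    "backup_name": "backup_20240619_1408",
    # Encoded as a string, not a nested object, since the swagger
    # declares the field's type as string.
    "db_collections": json.dumps(db_collections),
}

body = json.dumps(payload)
print(body)
```

If this shape is accepted by the server, the collection should land in the `dedw` database rather than `default`.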
wayblink commented 1 day ago

@mkotsalainen Hi, 2.3.3 is an old version. Please upgrade to 2.3.18 if convenient. It is hard to find the root cause from the log alone. Please provide your backup meta; there is a 'meta' folder in the backup path.