zilliztech / milvus-backup

Backup and restore tool for Milvus
Apache License 2.0
129 stars 46 forks source link

[Bug]: Restore failed after 15 minutes #189

Closed Meikelrizkyhartawan closed 1 year ago

Meikelrizkyhartawan commented 1 year ago

Current Behavior

I restored a collection with a size of about 16gb and failed when more than 15 minutes, i use milvus installed with helm, I have set importTaksExpiration in milvus.yaml with a value of 90000 or about 25 hours on configmaps, but it still doesn't work even the milvus.yaml already set as 90000 in rootcoord, are there any settings that need to be added to milvus backup?

milvus.yaml value :

rootCoord: address: milvus-ground-rootcoord dmlChannelNum: 16 enableActiveStandby: true importTaskExpiration: 90000 importTaskRetention: 86400 maxDatabaseNum: 64 maxPartitionNum: 4096 minSegmentSizeToEnableIndex: 1024 port: 53100

rootcoord error : [2023/09/03 09:50:07.501 +00:00] [INFO] [rootcoord/root_coord.go:2005] ["an import task has failed, marking DataNode available and resending import task"] ["task ID"=443982561981094687] [2023/09/03 09:50:07.501 +00:00] [INFO] [rootcoord/root_coord.go:1978] ["a DataNode is no longer busy after processing task"] ["dataNode ID"=178] ["task ID"=443982561981094687]

milvus backup error ; [2023/09/03 16:50:10.734 +07:00] [ERROR] [core/backup_impl_restore_backup.go:574] ["fail or timeout to bulk insert"] [error="bulk insert fail, info: import task was canceled"] [errorVerbose="bulk insert fail, info: import task was canceled\n(1) attached stack trace\n -- stack trace:\n | github.com/zilliztech/milvus-backup/core.(BackupContext).watchBulkInsertState\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:603\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeBulkInsert\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:572\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask.func4\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:478\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:502\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327\n | runtime.goexit\n | \t/usr/lib/golang/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: import task was canceled\nError types: (1) withstack.withStack (2) errutil.leafError"] [taskId=443982561981094687] [targetCollectionName=online_news_indobert] [partitionName=_default] [stack="github.com/zilliztech/milvus-backup/core.(BackupContext).executeBulkInsert\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:574\ngithub.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask.func4\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:478\ngithub.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:502\ngithub.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327"] [2023/09/03 16:50:10.734 +07:00] [ERROR] [core/backup_impl_restore_backup.go:480] ["fail to bulk insert to partition"] [error="bulk insert fail, info: import task was canceled"] [errorVerbose="bulk insert fail, info: import task was canceled\n(1) attached stack trace\n -- stack trace:\n | github.com/zilliztech/milvus-backup/core.(BackupContext).watchBulkInsertState\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:603\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeBulkInsert\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:572\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask.func4\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:478\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:502\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327\n | runtime.goexit\n | \t/usr/lib/golang/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: import task was canceled\nError types: (1) withstack.withStack (2) errutil.leafError"] [backupCollectionName=online_news_indobert] [targetCollectionName=online_news_indobert] [partition=_default] [stack="github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask.func4\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:480\ngithub.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:502\ngithub.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327"] [2023/09/03 16:50:10.735 +07:00] [ERROR] [core/backup_impl_restore_backup.go:504] ["fail to (copy and) bulkinsert data"] [error="bulk insert fail, info: import task was canceled"] [errorVerbose="bulk insert fail, info: import task was canceled\n(1) attached stack trace\n -- stack trace:\n | github.com/zilliztech/milvus-backup/core.(BackupContext).watchBulkInsertState\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:603\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeBulkInsert\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:572\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask.func4\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:478\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:502\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327\n | runtime.goexit\n | \t/usr/lib/golang/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: import task was canceled\nError types: (1) withstack.withStack (2) errutil.leafError"] [backupCollectionName=online_news_indobert] [targetCollectionName=online_news_indobert] [partition=_default] [stack="github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:504\ngithub.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327"] [2023/09/03 16:50:10.735 +07:00] [ERROR] [core/backup_impl_restore_backup.go:329] ["executeRestoreCollectionTask failed"] [TargetDBName=default] [TargetCollectionName=online_news_indobert] [error="bulk insert fail, info: import task was canceled"] [errorVerbose="bulk insert fail, info: import task was canceled\n(1) attached stack trace\n -- stack trace:\n | github.com/zilliztech/milvus-backup/core.(BackupContext).watchBulkInsertState\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:603\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeBulkInsert\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:572\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask.func4\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:478\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreCollectionTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:502\n | github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n | \t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:327\n | runtime.goexit\n | \t/usr/lib/golang/src/runtime/asm_amd64.s:1571\nWraps: (2) bulk insert fail, info: import task was canceled\nError types: (1) withstack.withStack (2) errutil.leafError"] [stack="github.com/zilliztech/milvus-backup/core.(BackupContext).executeRestoreBackupTask\n\t/opt/milvus/milvus-backup/core/backup_impl_restore_backup.go:329"]

Expected Behavior

No response

Steps To Reproduce

1.restore a backup with size 16GB
2.the error happen with rootcoordinator logs failed to attempt task

Environment

No response

Anything else?

No response

xiaofan-luan commented 1 year ago

an import task has failed

an import task has failed

could you find full log of milvus why the import task is failed?

Meikelrizkyhartawan commented 1 year ago

this is the log of full log of rootCoordinator pod :

[2023/09/03 09:50:00.950 +00:00] [INFO] [rootcoord/root_coord.go:1963] ["RootCoord receive import state report"] ["task ID"=443982561981094687] ["import state"=ImportStarted] [2023/09/03 09:50:00.950 +00:00] [INFO] [rootcoord/import_manager.go:538] ["import manager update task import result"] [taskID=443982561981094687] [2023/09/03 09:50:00.950 +00:00] [INFO] [rootcoord/import_manager.go:577] ["importManager update task info"] [toPersistImportTaskInfo="id:443982561981094687 datanode_id:178 collection_id:443982561981094682 partition_id:443982561981094683 channel_names:\"by-dev-rootcoord-dml_13_443982561981094682v0\" files:\"backup-agustus-atas/backup6/binlogs/insert_log/443315592509697084/443315592509697085/\" files:\"\" create_ts:1693733705 state: collection_name:\"online_news_indobert\" partition_name:\"_default\" infos:<key:\"backup\" value:\"true\" > infos:<key:\"end_ts\" value:\"444001037320192\" > infos:<key:\"progress_percent\" value:\"0\" > start_ts:1693733705 "] [2023/09/03 09:50:00.950 +00:00] [INFO] [rootcoord/import_manager.go:840] ["updating import task info in Etcd"] ["task ID"=443982561981094687] [2023/09/03 09:50:07.435 +00:00] [INFO] [rootcoord/root_coord.go:1963] ["RootCoord receive import state report"] ["task ID"=443982561981094687] ["import state"=ImportStarted] [2023/09/03 09:50:07.435 +00:00] [INFO] [rootcoord/import_manager.go:538] ["import manager update task import result"] [taskID=443982561981094687] [2023/09/03 09:50:07.435 +00:00] [INFO] [rootcoord/import_manager.go:577] ["importManager update task info"] [toPersistImportTaskInfo="id:443982561981094687 datanode_id:178 collection_id:443982561981094682 partition_id:443982561981094683 channel_names:\"by-dev-rootcoord-dml_13_443982561981094682v0\" files:\"backup-agustus-atas/backup6/binlogs/insert_log/443315592509697084/443315592509697085/\" files:\"\" create_ts:1693733705 state: collection_name:\"online_news_indobert\" partition_name:\"_default\" infos:<key:\"backup\" value:\"true\" > infos:<key:\"end_ts\" value:\"444001037320192\" > infos:<key:\"progress_percent\" value:\"0\" > start_ts:1693733705 "] [2023/09/03 09:50:07.435 +00:00] [INFO] [rootcoord/import_manager.go:840] ["updating import task info in Etcd"] ["task ID"=443982561981094687] [2023/09/03 09:50:07.494 +00:00] [INFO] [rootcoord/root_coord.go:1963] ["RootCoord receive import state report"] ["task ID"=443982561981094687] ["import state"=ImportFailed] [2023/09/03 09:50:07.494 +00:00] [INFO] [rootcoord/import_manager.go:538] ["import manager update task import result"] [taskID=443982561981094687] [2023/09/03 09:50:07.494 +00:00] [INFO] [rootcoord/import_manager.go:577] ["importManager update task info"] [toPersistImportTaskInfo="id:443982561981094687 datanode_id:178 collection_id:443982561981094682 partition_id:443982561981094683 channel_names:\"by-dev-rootcoord-dml_13_443982561981094682v0\" files:\"backup-agustus-atas/backup6/binlogs/insert_log/443315592509697084/443315592509697085/\" files:\"\" create_ts:1693733705 state:<stateCode:ImportFailed error_message:\"import task was canceled\" > collection_name:\"online_news_indobert\" partition_name:\"_default\" infos:<key:\"backup\" value:\"true\" > infos:<key:\"end_ts\" value:\"444001037320192\" > infos:<key:\"progress_percent\" value:\"0\" > start_ts:1693733705 "] [2023/09/03 09:50:07.494 +00:00] [INFO] [rootcoord/import_manager.go:840] ["updating import task info in Etcd"] ["task ID"=443982561981094687] [2023/09/03 09:50:07.501 +00:00] [INFO] [rootcoord/root_coord.go:2005] ["an import task has failed, marking DataNode available and resending import task"] ["task ID"=443982561981094687] [2023/09/03 09:50:07.501 +00:00] [INFO] [rootcoord/root_coord.go:1978] ["a DataNode is no longer busy after processing task"] ["dataNode ID"=178] ["task ID"=443982561981094687] [2023/09/03 09:51:05.574 +00:00] [INFO] [rootcoord/import_manager.go:177] ["(in cleanupLoop) trying to expire old tasks from memory and Etcd"] [2023/09/03 09:51:05.578 +00:00] [INFO] [rootcoord/import_manager.go:180] ["(in cleanupLoop) start removing bad import segments"] [2023/09/03 09:51:05.582 +00:00] [INFO] [rootcoord/import_manager.go:1012] ["trying to mark segments as dropped"] ["task ID"=443982561981094687] ["segment IDs"="[]"] [2023/09/03 09:51:05.582 +00:00] [INFO] [rootcoord/import_manager.go:611] ["trying to set the import state of an import task"] ["task ID"=443982561981094687] ["target state"=ImportFailedAndCleaned] [2023/09/03 09:51:05.582 +00:00] [INFO] [rootcoord/import_manager.go:840] ["updating import task info in Etcd"] ["task ID"=443982561981094687] [2023/09/03 09:51:05.592 +00:00] [INFO] [rootcoord/import_manager.go:182] ["(in cleanupLoop) start cleaning hanging busy DataNode"] [2023/09/03 09:56:05.574 +00:00] [INFO] [rootcoord/import_manager.go:177] ["(in cleanupLoop) trying to expire old tasks from memory and Etcd"] [2023/09/03 09:56:05.581 +00:00] [INFO] [rootcoord/import_manager.go:180] ["(in cleanupLoop) start removing bad import segments"] [2023/09/03 09:56:05.584 +00:00] [INFO] [rootcoord/import_manager.go:182] ["(in cleanupLoop) start cleaning hanging busy DataNode"] [2023/09/03 09:59:38.593 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=217] [2023/09/03 10:00:40.977 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=202] [2023/09/03 10:01:05.574 +00:00] [INFO] [rootcoord/import_manager.go:177] ["(in cleanupLoop) trying to expire old tasks from memory and Etcd"] [2023/09/03 10:01:05.579 +00:00] [INFO] [rootcoord/import_manager.go:180] ["(in cleanupLoop) start removing bad import segments"] [2023/09/03 10:01:05.584 +00:00] [INFO] [rootcoord/import_manager.go:182] ["(in cleanupLoop) start cleaning hanging busy DataNode"]

xiaofan-luan commented 1 year ago

use this script to export log https://github.com/milvus-io/milvus/blob/a34a9d606ca541e37547ebb0628857c91a192835/deployments/export-log/README.md?plain=1#L33

xiaofan-luan commented 1 year ago

and what milvus version it is?

xiaofan-luan commented 1 year ago

the failed reason should be in datacoord's log

Meikelrizkyhartawan commented 1 year ago

the version of the milvus is 2.2.10

this is the logs in tar milvus-log.tar.gz

here is the datacoord log :

[2023/09/03 10:55:36.203 +00:00] [INFO] [datacoord/meta.go:585] ["meta update: update drop channel segment info"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.203 +00:00] [INFO] [datacoord/meta.go:706] ["meta update: batch save drop segments"] ["drop segments"="[443982561981094756]"] [2023/09/03 10:55:36.211 +00:00] [INFO] [datacoord/kv_catalog.go:509] ["remove flag tombstone added"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.211 +00:00] [INFO] [datacoord/meta.go:622] ["meta update: update drop channel segment info - complete"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.211 +00:00] [INFO] [datacoord/services.go:568] ["DropVChannel plan to remove"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.211 +00:00] [INFO] [datacoord/handler.go:61] [GetDataVChanPositions] [collectionID=443982561981094682] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [numOfSegments=1] [2023/09/03 10:55:36.211 +00:00] [INFO] [datacoord/handler.go:288] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [posTs=444003388784377857] [posTime=2023/09/03 10:55:13.124 +00:00] [2023/09/03 10:55:36.212 +00:00] [INFO] [datacoord/channel_checker.go:110] ["timer started"] ["watch state"=ToRelease] [nodeID=178] ["channel name"=by-dev-rootcoord-dml_13_443982561981094682v0] ["check interval"=2m0s] [2023/09/03 10:55:36.219 +00:00] [INFO] [datacoord/channel_manager.go:720] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [progress=0] [2023/09/03 10:55:36.224 +00:00] [INFO] [datacoord/channel_checker.go:164] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [timerCount=0] [2023/09/03 10:55:36.224 +00:00] [INFO] [datacoord/channel_manager.go:646] ["datanode release channel successfully, will reassign"] [nodeID=178] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.224 +00:00] [INFO] [datacoord/channel_checker.go:131] ["stop timer before timeout"] ["watch state"=ToRelease] [nodeID=178] ["channel name"=by-dev-rootcoord-dml_13_443982561981094682v0] ["timeout interval"=2m0s] [runningTimerCount=0] [2023/09/03 10:55:36.226 +00:00] [INFO] [datacoord/channel_manager.go:560] ["remove channel assignment"] ["nodeID to be removed"=178] [channelID=by-dev-rootcoord-dml_13_443982561981094682v0] [collectionID=443982561981094682] [2023/09/03 10:55:36.229 +00:00] [INFO] [datacoord/kv_catalog.go:534] ["removing channel remove path"] [channel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.233 +00:00] [INFO] [datacoord/handler.go:415] ["DropChannel succeeded"] [vChannel=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.233 +00:00] [INFO] [datacoord/channel_manager.go:775] ["removed channel assignment"] ["channel name"=by-dev-rootcoord-dml_13_443982561981094682v0] [2023/09/03 10:55:36.261 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=369da2f22a4cfc1a] [module=DataCoord] [2023/09/03 10:55:36.261 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=369da2f22a4cfc1a] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"2990\" > status:<> "] [2023/09/03 10:55:36.295 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=1168a1a298fd62b4] [module=DataCoord] [2023/09/03 10:55:36.295 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=1168a1a298fd62b4] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"44209\" > status:<> "] [2023/09/03 10:55:36.319 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=38d5f2d9ea431aba] [module=DataCoord] [2023/09/03 10:55:36.319 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=38d5f2d9ea431aba] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"48432\" > status:<> "] [2023/09/03 10:55:36.348 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=1d315752c8071dc9] [module=DataCoord] [2023/09/03 10:55:36.348 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=1d315752c8071dc9] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"60155\" > status:<> "] [2023/09/03 10:55:36.375 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=be000fca9b4621] [module=DataCoord] [2023/09/03 10:55:36.375 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=be000fca9b4621] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"137875\" > status:<> "] [2023/09/03 10:55:36.399 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=4642e36e45dac67b] [module=DataCoord] [2023/09/03 10:55:36.399 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=4642e36e45dac67b] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"63862\" > status:<> "] [2023/09/03 10:55:36.424 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=5dacc84e437e0c02] [module=DataCoord] [2023/09/03 10:55:36.424 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=5dacc84e437e0c02] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"1075461\" > status:<> "] [2023/09/03 10:55:36.452 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=d63ed46ad536de7] [module=DataCoord] [2023/09/03 10:55:36.452 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=d63ed46ad536de7] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"104557\" > status:<> "] [2023/09/03 10:55:36.479 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=6ae13f0edad189] [module=DataCoord] [2023/09/03 10:55:36.479 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=6ae13f0edad189] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"44180\" > status:<> "] [2023/09/03 10:55:36.520 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=4915736906bbe15f] [module=DataCoord] [2023/09/03 10:55:36.520 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=4915736906bbe15f] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"26758\" > status:<> "] [2023/09/03 10:55:36.555 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=4b1ea2f4c130fc6f] [module=DataCoord] [2023/09/03 10:55:36.555 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=4b1ea2f4c130fc6f] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"663271\" > status:<> "] [2023/09/03 10:55:36.585 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=7694294375db639c] [module=DataCoord] [2023/09/03 10:55:36.585 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=7694294375db639c] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"20168\" > status:<> "] [2023/09/03 10:55:36.622 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=5083bc3385baa657] [module=DataCoord] [2023/09/03 10:55:36.622 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=5083bc3385baa657] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"1447050\" > status:<> "] [2023/09/03 10:55:36.651 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=7d7b20d4a1f9a315] [module=DataCoord] [2023/09/03 10:55:36.651 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=7d7b20d4a1f9a315] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"39691\" > status:<> "] [2023/09/03 10:55:36.678 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=7a6a884f57772575] [module=DataCoord] [2023/09/03 10:55:36.678 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=7a6a884f57772575] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"102680\" > status:<> "] [2023/09/03 10:55:36.710 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=24f50039acc38d92] [module=DataCoord] [2023/09/03 10:55:36.710 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=24f50039acc38d92] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"26\" > status:<> "] [2023/09/03 10:55:36.738 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=11880406abce9adf] [module=DataCoord] [2023/09/03 10:55:36.738 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=11880406abce9adf] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"103264\" > status:<> "] [2023/09/03 10:55:36.772 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=67c7fe158cdda0b7] [module=DataCoord] [2023/09/03 10:55:36.772 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=67c7fe158cdda0b7] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"2179\" > status:<> "] [2023/09/03 10:55:36.798 +00:00] [INFO] [datacoord/services.go:280] ["received request to get collection statistics"] [traceID=46f86d0465f6fb87] [module=DataCoord] [2023/09/03 10:55:36.798 +00:00] [INFO] [datacoord/services.go:293] ["success to get collection statistics"] [traceID=46f86d0465f6fb87] [module=DataCoord] [response="stats:<key:\"row_count\" value:\"32381\" > status:<> "] [2023/09/03 10:55:45.205 +00:00] [INFO] [datacoord/services.go:1216] ["receive watch channels request"] [channels="[by-dev-rootcoord-dml_14_443982561981123499v0]"] [2023/09/03 10:55:45.205 +00:00] [INFO] [datacoord/channel_manager.go:434] ["try to update channel watch info with ToWatch state"] [channel="Name: by-dev-rootcoord-dml_14_443982561981123499v0, CollectionID: 443982561981123499, StartPositions: [key:\"by-dev-rootcoord-dml_14\" data:\"\010\272\n\020\346?\030\000 \000\" ]"] [updates="[\"{type=Add,nodeID=178,channels=\\"[by-dev-rootcoord-dml_14_443982561981123499v0]\\"}\"]"] [2023/09/03 10:55:45.205 +00:00] [INFO] [datacoord/handler.go:61] [GetDataVChanPositions] [collectionID=443982561981123499] [channel=by-dev-rootcoord-dml_14_443982561981123499v0] [numOfSegments=0] [2023/09/03 10:55:45.206 +00:00] [INFO] [datacoord/handler.go:270] ["segment position not found, setting channel seek position to channel start position"] [channel=by-dev-rootcoord-dml_14_443982561981123499v0] [posTs=0] [posTime=1970/01/01 00:00:00.000 +00:00] [2023/09/03 10:55:45.206 +00:00] [INFO] [datacoord/handler.go:306] ["channel seek position set from collection start position"] [channel=by-dev-rootcoord-dml_14_443982561981123499v0] [posTs=0] [posTime=1970/01/01 00:00:00.000 +00:00] [2023/09/03 10:55:45.206 +00:00] [INFO] [datacoord/channel_checker.go:110] ["timer started"] ["watch state"=ToWatch] [nodeID=178] ["channel name"=by-dev-rootcoord-dml_14_443982561981123499v0] ["check interval"=2m0s] [2023/09/03 10:55:45.212 +00:00] [INFO] [datacoord/channel_manager.go:720] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_14_443982561981123499v0] [progress=0] [2023/09/03 10:55:45.215 +00:00] [INFO] [datacoord/kv_catalog.go:498] ["NON remove flag tombstone added"] [channel=by-dev-rootcoord-dml_14_443982561981123499v0] [2023/09/03 10:55:45.244 +00:00] [INFO] [datacoord/services.go:1348] ["DataCoord receives import request"] ["import task request"="import_task:<collection_id:443982561981123499 partition_id:443982561981123500 channel_names:\"by-dev-rootcoord-dml_14_443982561981123499v0\" task_id:443982561981123504 files:\"backup-agustus-atas/backup6/binlogs/insert_log/443315592509697084/443315592509697085/\" files:\"\" infos:<key:\"backup\" value:\"true\" > infos:<key:\"end_ts\" value:\"444001037320192\" > > "] [2023/09/03 10:55:45.244 +00:00] [INFO] [datacoord/services.go:1366] ["available DataNodes are"] ["node ID"="[175,178,160]"] [2023/09/03 10:55:45.244 +00:00] [INFO] [datacoord/services.go:1372] ["picking a free dataNode"] ["all dataNodes"="[175,178,160]"] ["picking free dataNode with ID"=160] [2023/09/03 10:55:45.390 +00:00] [INFO] [datacoord/channel_checker.go:164] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_14_443982561981123499v0] [timerCount=0] [2023/09/03 10:55:45.390 +00:00] [INFO] [datacoord/channel_manager.go:625] ["datanode successfully watched channel"] [nodeID=178] ["channel name"=by-dev-rootcoord-dml_14_443982561981123499v0] [2023/09/03 10:55:45.390 +00:00] [INFO] [datacoord/channel_checker.go:131] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=178] ["channel name"=by-dev-rootcoord-dml_14_443982561981123499v0] ["timeout interval"=2m0s] [runningTimerCount=0] [2023/09/03 10:55:57.080 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550248277"] [2023/09/03 10:55:57.080 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:55:57.081 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.082 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305932693"] [2023/09/03 10:55:57.082 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305940041"] [2023/09/03 10:55:57.083 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279054"] [2023/09/03 10:55:57.083 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:55:57.084 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281041"] [2023/09/03 10:55:57.084 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.085 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937016"] [2023/09/03 10:55:57.085 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305932415"] [2023/09/03 10:55:57.086 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550248824"] [2023/09/03 10:55:57.086 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:55:57.087 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937702"] [2023/09/03 10:55:57.087 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305939002"] [2023/09/03 10:55:57.088 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:55:57.089 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662280991"] [2023/09/03 10:55:57.089 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.089 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.090 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937448"] [2023/09/03 10:55:57.090 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937134"] [2023/09/03 10:55:57.091 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550247625"] [2023/09/03 10:55:57.091 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550232385"] [2023/09/03 10:55:57.092 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.092 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662291234"] [2023/09/03 10:55:57.093 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305939679"] [2023/09/03 10:55:57.094 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.094 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550247279"] [2023/09/03 10:55:57.095 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.095 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:55:57.096 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281151"] [2023/09/03 10:55:57.096 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305938179"] [2023/09/03 10:55:57.097 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550249571"] [2023/09/03 10:55:57.097 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:55:57.098 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279054"] [2023/09/03 10:55:58.480 +00:00] [INFO] [datacoord/services.go:147] ["handle assign segment request"] [collectionID=443982561981123499] [partitionID=443982561981123500] [channelName=by-dev-rootcoord-dml_14_443982561981123499v0] [count=1] [isImport=true] ["import task ID"=0] [2023/09/03 10:55:58.482 +00:00] [INFO] [datacoord/meta.go:147] ["meta update: add collection"] ["collection ID"=443982561981123499] [2023/09/03 10:55:58.482 +00:00] [INFO] [datacoord/meta.go:153] ["meta update: add collection - complete"] ["collection ID"=443982561981123499] [2023/09/03 10:55:58.483 +00:00] [INFO] [datacoord/meta.go:263] ["meta update: adding segment"] ["segment ID"=443982561981123575] [2023/09/03 10:55:58.486 +00:00] [INFO] [datacoord/meta.go:275] ["meta update: adding segment - complete"] ["segment ID"=443982561981123575] [2023/09/03 10:55:58.486 +00:00] [INFO] [datacoord/segment_manager.go:381] ["datacoord: estimateTotalRows: "] [CollectionID=443982561981123499] [SegmentID=443982561981123575] [Rows=52184] [Channel=by-dev-rootcoord-dml_14_443982561981123499v0] [2023/09/03 10:55:58.486 +00:00] [INFO] [datacoord/meta.go:880] ["meta update: add allocation"] [segmentID=443982561981123575] [allocation="SegmentID: 443982561981123575, NumOfRows: 1, ExpireTime: 2023-09-13 20:55:58.474 +0000 UTC"] [2023/09/03 10:55:58.489 +00:00] [INFO] [datacoord/meta.go:904] ["meta update: add allocation - complete"] [segmentID=443982561981123575] [2023/09/03 10:55:58.489 +00:00] [INFO] [datacoord/services.go:186] ["success to assign segments"] [collectionID=443982561981123499] [assignments="[{\"SegmentID\":443982561981123575,\"NumOfRows\":1,\"ExpireTime\":444239330272608260}]"] [2023/09/03 10:55:58.494 +00:00] [INFO] [datacoord/server.go:641] ["Updating segment number of rows"] ["segment ID"=443982561981123575] ["old value"=0] ["new value"=6524] [2023/09/03 10:56:57.073 +00:00] [INFO] [datacoord/channel_manager.go:273] ["ChannelManager is not silent, skip channel balance this round"] [2023/09/03 10:56:57.079 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305939002"] [2023/09/03 10:56:57.080 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937702"] [2023/09/03 10:56:57.081 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550249571"] [2023/09/03 10:56:57.081 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305939679"] [2023/09/03 10:56:57.082 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937134"] [2023/09/03 10:56:57.082 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:56:57.083 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 10:56:57.083 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550247625"] [2023/09/03 10:56:57.084 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937016"] [2023/09/03 10:56:57.085 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279054"] [2023/09/03 10:56:57.085 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550248824"] [2023/09/03 10:56:57.086 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 10:56:57.087 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281041"] [2023/09/03 10:56:57.088 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550248277"]

xiaofan-luan commented 1 year ago

seems that bulkinsert is failed due to timeout

try. to tune rootCoord. importTaskExpiration to longer time, say 3 hour see if it works

Meikelrizkyhartawan commented 1 year ago

i try to tune the importTaskExpiration into 10800 or about 3 hours on configmap but the task still failed like before after 15 minutes, i already restart the milvus deployment to update the configmanp but the error still occurs as before

the rootCoord configuration that inside milvus.yaml:

rootCoord: address: milvus-ground-rootcoord dmlChannelNum: 16 enableActiveStandby: true importTaskExpiration: 10800 importTaskRetention: 86400 maxDatabaseNum: 64 maxPartitionNum: 4096 minSegmentSizeToEnableIndex: 1024 port: 53100

dataCoord logs :

[2023/09/03 15:17:55.775 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 15:17:55.775 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305937702"] [2023/09/03 15:17:55.776 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550247625"] [2023/09/03 15:17:55.776 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 15:17:55.777 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305932693"] [2023/09/03 15:17:55.777 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279054"] [2023/09/03 15:17:55.778 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305939002"] [2023/09/03 15:17:55.778 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662280991"] [2023/09/03 15:17:55.779 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443980747550248824"] [2023/09/03 15:17:55.779 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 15:17:55.779 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443976574305940041"] [2023/09/03 15:17:55.780 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281151"] [2023/09/03 15:17:55.780 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281041"] [2023/09/03 15:18:55.752 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662281642"] [2023/09/03 15:18:55.753 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662279806"] [2023/09/03 15:18:55.753 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="index doesn't exist, collectionID 443964077662280991"]

rootCoord logs :

[2023/09/03 15:15:53.674 +00:00] [INFO] [rootcoord/import_manager.go:840] ["updating import task info in Etcd"] ["task ID"=444006999213752836] [2023/09/03 15:15:53.680 +00:00] [INFO] [rootcoord/import_manager.go:182] ["(in cleanupLoop) start cleaning hanging busy DataNode"] [2023/09/03 15:16:19.779 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=232] [2023/09/03 15:17:19.613 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=264] [2023/09/03 15:17:29.949 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=201] [2023/09/03 15:17:50.814 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=265] [2023/09/03 15:17:54.393 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=244] [2023/09/03 15:18:42.159 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=210] [2023/09/03 15:19:20.786 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=238] [2023/09/03 15:19:57.983 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=235] [2023/09/03 15:20:20.380 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=233] [2023/09/03 15:20:53.664 +00:00] [INFO] [rootcoord/import_manager.go:177] ["(in cleanupLoop) trying to expire old tasks from memory and Etcd"] [2023/09/03 15:20:53.670 +00:00] [INFO] [rootcoord/import_manager.go:180] ["(in cleanupLoop) start removing bad import segments"] [2023/09/03 15:20:53.676 +00:00] [INFO] [rootcoord/import_manager.go:182] ["(in cleanupLoop) start cleaning hanging busy DataNode"] [2023/09/03 15:21:09.974 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=226] [2023/09/03 15:21:21.172 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=224] [2023/09/03 15:21:36.001 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=254] [2023/09/03 15:21:37.365 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=217] [2023/09/03 15:24:17.363 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=215] [2023/09/03 15:25:53.664 +00:00] [INFO] [rootcoord/import_manager.go:177] ["(in cleanupLoop) trying to expire old tasks from memory and Etcd"] [2023/09/03 15:25:53.676 +00:00] [INFO] [rootcoord/import_manager.go:180] ["(in cleanupLoop) start removing bad import segments"] [2023/09/03 15:25:53.680 +00:00] [INFO] [rootcoord/import_manager.go:182] ["(in cleanupLoop) start cleaning hanging busy DataNode"] [2023/09/03 15:26:07.360 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=212] [2023/09/03 15:26:38.962 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=214] [2023/09/03 15:26:56.376 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=227] [2023/09/03 15:27:02.781 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=233] [2023/09/03 15:27:03.017 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=235] [2023/09/03 15:27:26.682 +00:00] [WARN] [rootcoord/timeticksync.go:305] ["rootcoord send tt to all channels too slowly"] [chanNum=16] [span=334]

wayblink commented 1 year ago

@Xredys-19 Hi, please collect the latest milvus log. And can you share the backup metadata too? There is a meta folder in backup data.

Meikelrizkyhartawan commented 1 year ago

after I tried making a new backup and restoring it, it works, i do not hit the timeout, i guess there is a problem with the backup data, the importTaskExpiration is working, thanks