Closed: wcollin closed this issue 10 months ago
/assign @wcollin /unassign
I also got the same error, shown below. It occurred right after rootcoord launched, with no human trigger, and it seems to be emitted every 10 seconds.
milvus_rootcoord.1.435k793iun5o@test | [2023/10/30 16:36:57.971 +08:00] [WARN] [rootcoord/quota_center.go:742] ["failed to get collection rate limit config"] [collectionID=445204369917889111] [error="collection=445204369917889111: collection not found"] [errorVerbose="collection=445204369917889111: collection not found\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.wrapWithField\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:760\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrCollectionNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:372\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).getLatestCollectionByIDInternal\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:503\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).getCollectionByIDInternal\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:511\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).GetCollectionByID\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:595\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).getCollectionLimitConfig\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:740\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).checkDiskQuota\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:769\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).calculateWriteRates\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:431\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).calculateRates\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:680\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).run\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:149\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (2) collection=445204369917889111\nWraps: (3) collection not found\nError types: (1) withstack.withStack (2) errutil.withPrefix (3) merr.milvusError"] milvus_rootcoord.1.435k793iun5o@test | [2023/10/30 16:36:58.502 +08:00] [INFO] [rootcoord/root_coord.go:1142] ["received request to describe collection"] [traceID=cb17e2b73d1e28c37432f4da3838e7a0] [collectionName=] [dbName=] [id=445291254308930633] [ts=18446744073709551615] [allowUnavailable=false] milvus_rootcoord.1.435k793iun5o@test | [2023/10/30 16:36:58.502 +08:00] [INFO] [rootcoord/root_coord.go:1171] ["done to describe collection"] [traceID=cb17e2b73d1e28c37432f4da3838e7a0] [collectionName=] [dbName=] [id=445291254308930633] [ts=18446744073709551615] [allowUnavailable=false] [collection_id=445291254308930633]
milvus_rootcoord.1.435k793iun5o@test | [2023/10/30 16:37:08.502 +08:00] [INFO] [rootcoord/root_coord.go:1142] ["received request to describe collection"] [traceID=72813ecb1ba36602f006fd2a79330a5e] [collectionName=] [dbName=] [id=445291254308930633] [ts=18446744073709551615] [allowUnavailable=false] milvus_rootcoord.1.435k793iun5o@test | [2023/10/30 16:37:08.502 +08:00] [INFO] [rootcoord/root_coord.go:1171] ["done to describe collection"] [traceID=72813ecb1ba36602f006fd2a79330a5e] [collectionName=] [dbName=] [id=445291254308930633] [ts=18446744073709551615] [allowUnavailable=false] [collection_id=445291254308930633] milvus_rootcoord.1.435k793iun5o@test | [2023/10/30 16:37:09.973 +08:00] [WARN] [rootcoord/quota_center.go:742] ["failed to get collection rate limit config"] [collectionID=445204369917889111] [error="collection=445204369917889111: collection not found"] [errorVerbose="collection=445204369917889111: collection not found\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.wrapWithField\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:760\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrCollectionNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:372\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).getLatestCollectionByIDInternal\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:503\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).getCollectionByIDInternal\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:511\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).GetCollectionByID\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:595\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).getCollectionLimitConfig\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:740\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).checkDiskQuota\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:769\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).calculateWriteRates\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:431\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).calculateRates\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:680\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).run\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:149\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (2) collection=445204369917889111\nWraps: (3) collection not found\nError types: (1) withstack.withStack (2) errutil.withPrefix (3) merr.milvusError"]
Could you please provide the full logs for investigation? Please refer to this doc to export the complete Milvus logs. For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs. @zspishere
@yanliang567 I run Milvus in a swarm cluster, so the log collection script may not work for me.
However, I checked the other services and ONLY found errors like the ones below, in the datacoord container.
I think they may be related to THIS issue. @yanliang567
[2023/10/30 16:48:54.787 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:54.796 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.796 +08:00] [WARN] [datacoord/channel_manager.go:650] ["datanode watch channel failed or timeout, will release"] [nodeID=602] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.796 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.796 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.796 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.796 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToRelease] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:54.804 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:54.812 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.812 +08:00] [INFO] [datacoord/channel_manager.go:669] ["datanode release channel successfully, will reassign"] [nodeID=602] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.812 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToRelease] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.813 +08:00] [INFO] [datacoord/policy.go:474] ["AverageReassignPolicy working"] [avaNodesCount=3] [toAssignChannelNum=1] [avaNodesChannelSum=1] [2023/10/30 16:48:54.813 +08:00] [INFO] [datacoord/channel_manager.go:821] ["channel manager reassigning channels"] ["old node ID"=602] [updates="[\"{type=Delete,nodeID=602,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\",\"{type=Add,nodeID=600,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\"]"] [2023/10/30 16:48:54.813 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.813 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.813 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToWatch] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:54.840 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] 
[progress=0] [2023/10/30 16:48:54.854 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.854 +08:00] [WARN] [datacoord/channel_manager.go:650] ["datanode watch channel failed or timeout, will release"] [nodeID=600] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.854 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.854 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.854 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.854 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToRelease] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:54.862 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:54.871 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.871 +08:00] [INFO] [datacoord/channel_manager.go:669] ["datanode release channel successfully, will reassign"] [nodeID=600] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.871 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToRelease] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.903 +08:00] [INFO] [datacoord/policy.go:474] ["AverageReassignPolicy working"] [avaNodesCount=3] [toAssignChannelNum=1] [avaNodesChannelSum=1] [2023/10/30 16:48:54.904 +08:00] [INFO] [datacoord/channel_manager.go:821] ["channel manager reassigning channels"] ["old node ID"=600] [updates="[\"{type=Delete,nodeID=600,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\",\"{type=Add,nodeID=602,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\"]"] [2023/10/30 16:48:54.904 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.904 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.904 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToWatch] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:54.921 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:54.929 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] 
[channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.929 +08:00] [WARN] [datacoord/channel_manager.go:650] ["datanode watch channel failed or timeout, will release"] [nodeID=602] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.929 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.929 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.929 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.929 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToRelease] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:54.937 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:54.966 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.966 +08:00] [INFO] [datacoord/channel_manager.go:669] ["datanode release channel successfully, will reassign"] [nodeID=602] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.966 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToRelease] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.966 +08:00] [INFO] [datacoord/policy.go:474] ["AverageReassignPolicy working"] [avaNodesCount=3] [toAssignChannelNum=1] [avaNodesChannelSum=1] [2023/10/30 16:48:54.966 +08:00] [INFO] [datacoord/channel_manager.go:821] ["channel manager reassigning channels"] ["old node ID"=602] [updates="[\"{type=Delete,nodeID=602,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\",\"{type=Add,nodeID=600,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\"]"] [2023/10/30 16:48:54.967 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.967 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.967 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToWatch] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:54.979 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:54.996 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:54.996 +08:00] [WARN] [datacoord/channel_manager.go:650] ["datanode watch 
channel failed or timeout, will release"] [nodeID=600] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:54.996 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:54.996 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:54.996 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:54.996 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToRelease] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:55.004 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:55.029 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:55.029 +08:00] [INFO] [datacoord/channel_manager.go:669] ["datanode release channel successfully, will reassign"] [nodeID=600] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:55.029 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToRelease] [nodeID=600] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:55.029 +08:00] [INFO] [datacoord/policy.go:474] ["AverageReassignPolicy working"] [avaNodesCount=3] [toAssignChannelNum=1] [avaNodesChannelSum=1] [2023/10/30 16:48:55.029 +08:00] [INFO] [datacoord/channel_manager.go:821] ["channel manager reassigning channels"] ["old node ID"=600] [updates="[\"{type=Delete,nodeID=600,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\",\"{type=Add,nodeID=602,channels=\\"[by-dev-rootcoord-dml_0_445204369917889111v0]\\"}\"]"] [2023/10/30 16:48:55.029 +08:00] [INFO] [datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:55.030 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:55.030 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToWatch] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s] [2023/10/30 16:48:55.046 +08:00] [INFO] [datacoord/channel_manager.go:744] ["tickle update, timer delay"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [progress=0] [2023/10/30 16:48:55.082 +08:00] [INFO] [datacoord/channel_checker.go:167] ["stop timer for channel"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [timerCount=0] [2023/10/30 16:48:55.082 +08:00] [WARN] [datacoord/channel_manager.go:650] ["datanode watch channel failed or timeout, will release"] [nodeID=602] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [2023/10/30 16:48:55.082 +08:00] [INFO] 
[datacoord/handler.go:62] [GetDataVChanPositions] [collectionID=445204369917889111] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [numOfSegments=1] [2023/10/30 16:48:55.082 +08:00] [INFO] [datacoord/handler.go:299] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_445204369917889111v0] [posTs=445250214596444161] [posTime=2023/10/28 20:06:16.140 +08:00] [2023/10/30 16:48:55.082 +08:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["timeout interval"=5m0s] [runningTimerCount=0] [2023/10/30 16:48:55.083 +08:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToRelease] [nodeID=602] [channelName=by-dev-rootcoord-dml_0_445204369917889111v0] ["check interval"=5m0s]
Mmm, swarm is not in the official support list, and we are not able to address the issue without the full milvus logs. :(
@wcollin found the issue in standalone mode, so I think the error does not depend on the deployment mode.
Maybe @wcollin can provide the full Milvus logs.
Hi, I have the same issue in cluster mode, from the root coordinator: [WARN] [rootcoord/quota_center.go:742] ["failed to get collection rate limit config"] [collectionID=445345169243756901] [error="collection=445345169243756901: collection not found"].
I am not sure whether it is related to any config. I am attaching the root coordinator's full startup logs along with the error:
milvus-rootcoordinatorlogs-failedtogetcollectionratelimitconfig.txt
The warning message is a bit confusing; @bigsheeper could you please double-check? /assign @bigsheeper
This seems to be an issue we need to take care of.
This phenomenon occurs solely when collections are dropped in DataNodes or QueryNodes. It is reasonable if this log is printed only momentarily (within about 1 minute) after the collection has been dropped. Conversely, it is UNreasonable if this log persists for an extended duration (exceeding 10 minutes) after the collection has been dropped; in that case, the collection may be leaking in the DataNodes or QueryNodes. @shaiktas @zspishere
It is not config related. You're right, the message is confusing; I shall enhance this log. @yanliang567
@bigsheeper I think there is leakage somewhere.
Could anyone offer the logs if possible?
Hi @shaiktas, I have a few questions:
It would be greatly appreciated if you could refer to this doc to export the complete Milvus logs for further investigation.
I have the same issue. Milvus 2.3.2, standalone, Go SDK. I use upsert() to batch insert data and the function returns nil, but no data is found when I use GetCollectionStatistics() or QueryByPks(). Collection list:
logs: [Uploading mil3000.log…]()
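For reference, here is a minimal sketch of the reported flow using the milvus-sdk-go v2 client. The collection name, field names, and dimension are illustrative, and the method names and signatures are approximate assumptions rather than something confirmed in this thread:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/milvus-io/milvus-sdk-go/v2/client"
	"github.com/milvus-io/milvus-sdk-go/v2/entity"
)

func main() {
	ctx := context.Background()

	// Connect to a local standalone instance (address is an assumption).
	c, err := client.NewGrpcClient(ctx, "localhost:19530")
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	const collection = "my_collection" // illustrative name
	const dim = 128                    // illustrative vector dimension

	// Build a small batch: primary keys plus vectors.
	pks := []int64{1, 2, 3}
	vectors := make([][]float32, len(pks))
	for i := range vectors {
		vectors[i] = make([]float32, dim)
	}
	idCol := entity.NewColumnInt64("id", pks)
	vecCol := entity.NewColumnFloatVector("vector", dim, vectors)

	// Upsert returns a nil error once the request is accepted, even though
	// the rows may not be visible to reads yet.
	if _, err := c.Upsert(ctx, collection, "", idCol, vecCol); err != nil {
		log.Fatal(err)
	}

	// Row count here can lag behind the write: statistics are asynchronous.
	stats, err := c.GetCollectionStatistics(ctx, collection)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("row_count:", stats["row_count"])

	// Query back by primary key; under the default (bounded-staleness)
	// consistency the new rows may still be invisible for a short window.
	res, err := c.QueryByPks(ctx, collection, nil, idCol, []string{"id"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("columns returned:", len(res))
}
```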
I think this is a separate topic. With the default consistency level, data may be invisible for up to 3 seconds; this is what we call bounded staleness. Usually it takes about 200 milliseconds for the data to become visible on the querynode.
Also, GetCollectionStatistics is async; QueryByPks with strong consistency should return the correct result.
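To illustrate the suggestion above, a hedged sketch of re-issuing the primary-key query with strong consistency, building on the earlier snippet; the consistency option and constant names are assumptions about the milvus-sdk-go v2 client and may need adjusting to your SDK version:

```go
// Re-run the primary-key query with strong consistency so the read waits
// for the latest writes to become visible.
res, err := c.QueryByPks(
	ctx,
	"my_collection", // illustrative collection name
	nil,             // all partitions
	entity.NewColumnInt64("id", []int64{1, 2, 3}),
	[]string{"id"},
	client.WithSearchQueryConsistencyLevel(entity.ClStrong),
)
if err != nil {
	log.Fatal(err)
}
fmt.Println("columns returned with strong consistency:", len(res))
```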
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30 days of inactivity. Reopen the issue with /reopen.
Is there an existing issue for this?
Environment
Current Behavior
[2023/10/21 08:15:26.492 +00:00] [WARN] [rootcoord/quota_center.go:742] ["failed to get collection rate limit config"] [collectionID=444885560750714875] [error="collection=444885560750714875: collection not found"] [errorVerbose="collection=444885560750714875: collection not found\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.wrapWithField\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:531\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrCollectionNotFound\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/merr/utils.go:242\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).getLatestCollectionByIDInternal\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:503\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).getCollectionByIDInternal\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:511\n | github.com/milvus-io/milvus/internal/rootcoord.(MetaTable).GetCollectionByID\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/meta_table.go:595\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).getCollectionLimitConfig\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:740\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).checkDiskQuota\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:769\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).calculateWriteRates\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:431\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).calculateRates\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:680\n | github.com/milvus-io/milvus/internal/rootcoord.(QuotaCenter).run\n | \t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:149\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (2) collection=444885560750714875\nWraps: (3) collection not found\nError types: (1) withstack.withStack (2) errutil.withPrefix (3) merr.milvusError"]
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response