Closed hotwinters closed 1 month ago
@hotwinters quick questions:
log.zip @yanliang567 This is a compressed RAR file that only exports a portion of the logs. I'm not sure if there is any information you need, but I noticed that during the exception period, it takes about 8-20 seconds for me to create and load a new collection with a delay
I don't have any tool to uncomress rar file.
how many collections are there existing in your cluster?
/assign @smellthemoon
could you help on it
log.zip @xiaofan-luan @smellthemoon I made it into a zip file and re uploaded it,i use stanalone,about 130 col at memory,and I delete 60 temporary collections every 20 minutes and then create another 60
log-cpu-100.zip Today I found that the CPU usage rate has been consistently maintained at 100%, and this is the latest log.
Why does the CPU continue to use 100% after having too many cols
Why does the CPU continue to use 100% after having too many cols
After restarting, milvus will recover info and load the previous data. However, due to the frequent creation of collections, there are too many tasks in the task pool on the querynode side, the latency increases, the CPU usage continues to rise, and finally milvus restarts.
Frequent creation and deletion of database tables may cause some problems and impacts. If necessary, you can use cluster to improve performance by adding machines. @hotwinters
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
.
Is there an existing issue for this?
Environment
Current Behavior
Every few seconds, the CPU usage of Milvus increases to 5 cores and then decreases. The SDK shows a high delay in creating and deleting libraries, so I restarted my Gosdk server and found that the delay was still there. After restarting Milvus, the functionality returned to normal
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
[2024/07/25 06:17:40.538 +00:00] [ERROR] [retry/retry.go:46] ["retry func failed"] ["retry time"=0] [error="stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:553 github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).Call\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:567 github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:104 github.com/milvus-io/milvus/internal/distributed/datacoord/client.wrapGrpcCall[...]\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:336 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(Client).GetRecoveryInfoV2\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/coordinator_broker.go:145 github.com/milvus-io/milvus/internal/querycoordv2/meta.(CoordinatorBroker).GetRecoveryInfoV2\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:209 github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2.func1\n/go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:44 github.com/milvus-io/milvus/pkg/util/retry.Do\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:238 github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2\n/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:119 github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).UpdateCollectionNextTarget: rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace: rpc error: code = DeadlineExceeded desc = context deadline exceeded\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).Call\n | \t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:553\n | github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).ReCall\n | \t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:567\n | github.com/milvus-io/milvus/internal/distributed/datacoord/client.wrapGrpcCall[...]\n | \t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:104\n | github.com/milvus-io/milvus/internal/distributed/datacoord/client.(Client).GetRecoveryInfoV2\n | \t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:336\n | github.com/milvus-io/milvus/internal/querycoordv2/meta.(CoordinatorBroker).GetRecoveryInfoV2\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/coordinator_broker.go:145\n | github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2.func1\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:209\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:44\n | github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:238\n | github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).UpdateCollectionNextTarget\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:119\n | github.com/milvus-io/milvus/internal/querycoordv2/observers.(TargetObserver).updateNextTarget\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/observers/target_observer.go:281\n | github.com/milvus-io/milvus/internal/querycoordv2/observers.(TargetObserver).check\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/observers/target_observer.go:201\n | github.com/milvus-io/milvus/internal/querycoordv2/observers.(taskDispatcher[...]).schedule.func1.1\n | \t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/observers/task_dispatcher.go:101\n | github.com/milvus-io/milvus/pkg/util/conc.(Pool[...]).Submit.func1\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\n | github.com/panjf2000/ants/v2.(goWorker).run.func1\n | \t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (2) stack trace: /go/src/github.com/milvus-io/milvus/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/tracer.StackTrace\n | /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:553 github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).Call\n | /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:567 github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).ReCall\n | /go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:104 github.com/milvus-io/milvus/internal/distributed/datacoord/client.wrapGrpcCall[...]\n | /go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:336 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(Client).GetRecoveryInfoV2\n | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/coordinator_broker.go:145 github.com/milvus-io/milvus/internal/querycoordv2/meta.(CoordinatorBroker).GetRecoveryInfoV2\n | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:209 github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2.func1\n | /go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:44 github.com/milvus-io/milvus/pkg/util/retry.Do\n | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:238 github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2\n | /go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:119 github.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).UpdateCollectionNextTarget\nWraps: (3) rpc error: code = DeadlineExceeded desc = context deadline exceeded\nError types: (1) withstack.withStack (2) errutil.withPrefix (3) status.Error"] [stack="github.com/milvus-io/milvus/pkg/util/retry.Do\n\t/go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:46\ngithub.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).PullNextTargetV2\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:238\ngithub.com/milvus-io/milvus/internal/querycoordv2/meta.(TargetManager).UpdateCollectionNextTarget\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/meta/target_manager.go:119\ngithub.com/milvus-io/milvus/internal/querycoordv2/observers.(TargetObserver).updateNextTarget\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/observers/target_observer.go:281\ngithub.com/milvus-io/milvus/internal/querycoordv2/observers.(TargetObserver).check\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/observers/target_observer.go:201\ngithub.com/milvus-io/milvus/internal/querycoordv2/observers.(taskDispatcher[...]).schedule.func1.1\n\t/go/src/github.com/milvus-io/milvus/internal/querycoordv2/observers/task_dispatcher.go:101\ngithub.com/milvus-io/milvus/pkg/util/conc.(Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67"]
Anything else?
No response