milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
30.26k stars 2.9k forks source link

[Bug]: Proxynode log warning "collection not found" #36314

Open iurii-stepanov opened 1 month ago

iurii-stepanov commented 1 month ago

Is there an existing issue for this?

Environment

- Milvus version: 2.4.10
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): external kafka
- OS(Ubuntu or CentOS): k8s cluster

Current Behavior

Since the start of the Proxynode I constantly see messages in the logs:

{"level":"WARN","time":"2024/09/17 08:55:38.883 +00:00","caller":"proxy/task_scheduler.go:477","message":"Failed to execute task: ","traceID":"baa4d9df37948b22e765f773fb41ffdf","error":"collection not found[database=cv_product_search][collection=nlp1]"}

But I don't have collection "nlp1" and neither in this database "cv_product_search" nor in any other. It is possible that this collection has been deleted or was even previously in the database default, I am not sure.

Where is the information about available collections stored? In etcd? Why is a Proxynode trying to get a non-existent collection?

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

Proxynode:

Welcome to use Milvus!
Version:   v2.4.10
Built:     Fri Aug 30 06:25:02 UTC 2024
GitCommit: a1d3932
GoVersion: go version go1.21.11 linux/amd64

TotalMem: 8589934592
UsedMem: 26537984

open pid file: /run/milvus/proxy.pid
lock pid file: /run/milvus/proxy.pid
[2024/09/17 08:53:53.836 +00:00] [INFO] [roles/roles.go:328] ["starting running Milvus components"]
[2024/09/17 08:53:53.836 +00:00] [INFO] [roles/roles.go:177] ["Enable Jemalloc"] ["Jemalloc Path"=/milvus/lib/libjemalloc.so]
[2024/09/17 08:53:53.843 +00:00] [DEBUG] [config/refresher.go:67] ["start refreshing configurations"] [source=FileSource]
[2024/09/17 08:53:53.844 +00:00] [DEBUG] [config/etcd_source.go:52] ["init etcd source"] [etcdInfo="{\"UseEmbed\":false,\"EnableAuth\":false,\"UserName\":\"\",\"PassWord\":\"\",\"UseSSL\":false,\"Endpoints\":[\"10.96.47.11:2379\",\"10.96.47.17:2379\",\"10.96.47.27:2379\"],\"KeyPrefix\":\"crater\",\"CertFile\":\"/path/to/etcd-client.pem\",\"KeyFile\":\"/path/to/etcd-client-key.pem\",\"CaCertFile\":\"/path/to/ca.pem\",\"MinVersion\":\"1.3\",\"RefreshInterval\":5000000000}"]
[2024/09/17 08:53:53.844 +00:00] [INFO] [etcd/etcd_util.go:47] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[10.96.47.11:2379,10.96.47.17:2379,10.96.47.27:2379]"] [minVersion=1.3]
[2024/09/17 08:53:53.852 +00:00] [DEBUG] [config/etcd_source.go:91] ["etcd refreshConfigurations"] [prefix=crater/config] [endpoints="[10.96.47.11:2379,10.96.47.17:2379,10.96.47.27:2379]"]
[2024/09/17 08:53:53.856 +00:00] [DEBUG] [config/refresher.go:67] ["start refreshing configurations"] [source=EtcdSource]
[2024/09/17 08:53:53.856 +00:00] [INFO] [paramtable/component_param.go:4204] ["DeployModeEnv is not set, use default"] [default=0.5]
[2024/09/17 08:53:53.857 +00:00] [INFO] [paramtable/hook_config.go:21] ["hook config"] [hook={}]
[2024/09/17 08:53:53.857 +00:00] [DEBUG] [config/refresher.go:67] ["start refreshing configurations"] [source=FileSource]
---Milvus Proxy successfully initialized and ready to serve!---
{"level":"WARN","time":"2024/09/17 08:55:38.883 +00:00","caller":"proxy/task_scheduler.go:477","message":"Failed to execute task: ","traceID":"baa4d9df37948b22e765f773fb41ffdf","error":"collection not found[database=cv_product_search][collection=nlp1]"}
{"level":"WARN","time":"2024/09/17 08:55:38.883 +00:00","caller":"proxy/impl.go:833","message":"LoadCollection failed to WaitToFinish","traceID":"baa4d9df37948b22e765f773fb41ffdf","role":"proxy","db":"cv_product_search","collection":"nlp1","refreshMode":false,"error":"collection not found[database=cv_product_search][collection=nlp1]","BeginTS":452608219897659398,"EndTS":452608219897659398}
{"level":"WARN","time":"2024/09/17 08:55:39.907 +00:00","caller":"proxy/task_scheduler.go:477","message":"Failed to execute task: ","traceID":"0b7e26dfef2e718a0d111457de2e94fa","error":"collection not found[database=cv_product_search][collection=nlp1]"}
{"level":"WARN","time":"2024/09/17 08:55:39.907 +00:00","caller":"proxy/impl.go:833","message":"LoadCollection failed to WaitToFinish","traceID":"0b7e26dfef2e718a0d111457de2e94fa","role":"proxy","db":"cv_product_search","collection":"nlp1","refreshMode":false,"error":"collection not found[database=cv_product_search][collection=nlp1]","BeginTS":452608220173172741,"EndTS":452608220173172741}
[2024/09/17 08:55:41.040 Z] [GIN] [*] [traceID=] [code=404] [latency=45.832µs] [client=10.97.47.25] [method=PRI] [error=]

Rootcoord:

{"level":"WARN","time":"2024/09/17 08:55:38.239 +00:00","caller":"rootcoord/timeticksync.go:313","message":"rootcoord send tt to all channels too slowly","chanNum":16,"span":1516}
{"level":"WARN","time":"2024/09/17 08:55:38.882 +00:00","caller":"rootcoord/root_coord.go:1179","message":"failed to describe collection","traceID":"97a973d853eaa544288cfc36e541c997","collectionName":"nlp1","dbName":"cv_product_search","id":0,"ts":18446744073709551615,"allowUnavailable":false,"error":"collection not found[database=cv_product_search][collection=nlp1]"}
{"level":"WARN","time":"2024/09/17 08:55:38.883 +00:00","caller":"rootcoord/root_coord.go:1179","message":"failed to describe collection","traceID":"baa4d9df37948b22e765f773fb41ffdf","collectionName":"nlp1","dbName":"cv_product_search","id":0,"ts":18446744073709551615,"allowUnavailable":false,"error":"collection not found[database=cv_product_search][collection=nlp1]"}
{"level":"WARN","time":"2024/09/17 08:55:39.755 +00:00","caller":"rootcoord/timeticksync.go:313","message":"rootcoord send tt to all channels too slowly","chanNum":16,"span":1515}

Anything else?

No response

yanliang567 commented 1 month ago

All the collection meta info is stored in etcd, did you have a clean etcd service for this milvus cluster? /assign @iurii-stepanov /unassign

xiaofan-luan commented 1 month ago

did you try to reuse etcd or S3 when you deploy a new cluster? or are u trying to deploy multiple milvus with same dependencies?

xiaofan-luan commented 1 month ago

did you ever created a collection named nlp1?

iurii-stepanov commented 2 weeks ago

All the collection meta info is stored in etcd, did you have a clean etcd service for this milvus cluster? /assign @iurii-stepanov /unassign

Yes. It's clean ETCD used by only one Milvus

iurii-stepanov commented 2 weeks ago

did you try to reuse etcd or S3 when you deploy a new cluster? or are u trying to deploy multiple milvus with same dependencies?

No, only one Milvus.

iurii-stepanov commented 2 weeks ago

did you ever created a collection named nlp1?

Yes. nlp1 was in another database - default.

I noticed that messages like this often appear in the logs "collection not found", but the collection is present in another database. For example, here is today's message: {"level":"WARN","time":"2024/10/09 12:30:32.994 +00:00","caller":"proxy/task_scheduler.go:469","message":"Failed to pre-execute task: collection not found[database=shopping_assistant_dev][collection=beauty_filtered_final_v1]","traceID":"beb5b3f122d4ae90e190372a6e9418c6"}

Actually a collection "beauty_filtered_final_v1" in the database default. I think it has something to do with the database switch in WebUI Attu.

xiaofan-luan commented 2 weeks ago

I think someone trying to load or search this collection. so milvus report the error. This doesn't means the system state is not correct.

on the other side, you might want to check your client see if some one is accessing nlp1 collection