milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Milvus start failed when trying to remove file and directory #37311

Open chyezh opened 2 hours ago

chyezh commented 2 hours ago

Is there an existing issue for this?

Environment

- Milvus version: master-3a3404658e849bd275cb35abeddfb008aba5c183
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

[2024/10/30 18:04:12.891 +08:00] [WARN] [segments/cgo_util.go:86] ["CStatus returns err"] [error="boost::filesystem::file_size: No such file or directory [system:2]: \"/var/lib/milvus/data/indexnode/text_log/453580601963416497/0/453580601963416496/103/c9dbb3c27f2b40debfcaafc2e33ef45e.store\""] [extra="get local used size failed"]
[2024/10/30 18:04:12.891 +08:00] [WARN] [querynodev2/server.go:318] ["get local used size failed"] [error="boost::filesystem::file_size: No such file or directory [system:2]: \"/var/lib/milvus/data/indexnode/text_log/453580601963416497/0/453580601963416496/103/c9dbb3c27f2b40debfcaafc2e33ef45e.store\""]
[2024/10/30 18:04:12.891 +08:00] [ERROR] [querynode/service.go:144] ["QueryNode init error: "] [error="boost::filesystem::file_size: No such file or directory [system:2]: \"/var/lib/milvus/data/indexnode/text_log/453580601963416497/0/453580601963416496/103/c9dbb3c27f2b40debfcaafc2e33ef45e.store\""] [stack="github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/home/chyezh/repository/chyezh/milvus/internal/distributed/querynode/service.go:144\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/home/chyezh/repository/chyezh/milvus/internal/distributed/querynode/service.go:222\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/home/chyezh/repository/chyezh/milvus/cmd/components/query_node.go:59\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/home/chyezh/repository/chyezh/milvus/cmd/roles/roles.go:129"]
[2024/10/30 18:04:12.891 +08:00] [ERROR] [components/query_node.go:60] ["QueryNode starts error"] [error="boost::filesystem::file_size: No such file or directory [system:2]: \"/var/lib/milvus/data/indexnode/text_log/453580601963416497/0/453580601963416496/103/c9dbb3c27f2b40debfcaafc2e33ef45e.store\""] [stack="github.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/home/chyezh/repository/chyezh/milvus/cmd/components/query_node.go:60\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/home/chyezh/repository/chyezh/milvus/cmd/roles/roles.go:129"]

panic: boost::filesystem::file_size: No such file or directory [system:2]: "/var/lib/milvus/data/indexnode/text_log/453580601963416497/0/453580601963416496/103/c9dbb3c27f2b40debfcaafc2e33ef45e.store"

Expected Behavior

No panic should happen.

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

chyezh commented 2 hours ago

Milvus standalone may crash at startup if the /var/lib/milvus directory is not empty.

IndexNode or another component may remove files under that directory while a different component (QueryNode, judging by the log above) is computing the directory's size.

If a file is removed after it has been listed but before its size is read, the size query fails with "No such file or directory", and that failure aborts startup.