C-rawler closed this issue 6 months ago.
@C-rawler we need the full Milvus logs for investigation. Could you please refer to this doc to export the whole Milvus logs? /assign @C-rawler /unassign
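If you are running Milvus standalone via docker compose, one simple way to capture the full logs is to dump the container's log stream to a file (the container name `milvus-standalone` is the default used by the standalone compose file; adjust it to your setup):

```shell
# Dump everything the container has logged so far, including stderr,
# into a single file that can then be shared via a cloud drive.
docker logs milvus-standalone > milvus-standalone.log 2>&1
```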
@yanliang567 Please tell me how to upload the complete milvus logs. Now there will be frequent exits.
please try some cloud drive for sharing?
Can you download it from Baidu Cloud?
https://drive.google.com/file/d/1ak0_DBjY9LkNeVZ2nHhU4msjzIvfHkZz/view?usp=drive_link
please share the key for downloading the logs
Sorry, please try this link again. https://drive.google.com/file/d/1ak0_DBjY9LkNeVZ2nHhU4msjzIvfHkZz/view?usp=sharing
@C-rawler I can see you are starting Milvus with some existing collections and partitions. Moreover, these collections and partitions were loaded before, so Milvus tries to load them again after starting. It suddenly fails to load the segments after loading about 888 MB of data... As a solution, I think you could add more memory to the Milvus pod/container. On the Milvus side, we have improved the memory prediction in Milvus 2.3.8; please retry on it if possible.
[collectionID=444925537510994479] [maxSegmentSize=416.8495330810547] [concurrency=1] [committedMemSize=703.9019622802734] [memUsage=888.7808685302734] [committedDiskSize=0] [diskUsage=0] [predictMemUsage=1305.6304016113281] [predictDiskUsage=0] [mmapEnabled=false]
milvus-standalone | [2024/02/06 14:00:17.152 +00:00] [INFO] [segments/segment_loader.go:400] ["request resource for loading segments (unit in MiB)"] [traceID=6064a8fde4ae287d4e6b60a3cb78fa01] [segmentIDs="[446469289939867624]"] [workerNum=190] [committedWorkerNum=553] [memory=416.8495330810547] [committedMemory=1120.7514953613281] [disk=0] [committedDisk=0]
milvus-standalone | [2024/02/06 14:00:17.152 +00:00] [INFO] [segments/segment_loader.go:186] ["start loading..."] [traceID=676dea72830c04acd4fcdc4a91f25e5a] [collectionID=444925537510994490] [segmentType=Sealed] [segmentNum=1] [afterFilter=1]
milvus-standalone | [2024/02/06 14:00:17.152 +00:00] [INFO] [segments/segment.go:182] ["create segment"] [collectionID=444925537510994479] [partitionID=444925537510994480] [segmentID=446469289939867624] [segmentType=Sealed]
milvus-standalone | [2024/02/06 14:00:17.152 +00:00] [INFO] [segments/segment_loader.go:258] ["start to load segments in parallel"] [traceID=6064a8fde4ae287d4e6b60a3cb78fa01] [collectionID=444925537510994479] [segmentType=Sealed] [segmentNum=1] [concurrencyLevel=1]
milvus-standalone | [2024/02/06 14:00:17.152 +00:00] [INFO] [segments/segment_loader.go:536] ["start loading segment files"] [traceID=6064a8fde4ae287d4e6b60a3cb78fa01] [collectionID=444925537510994479] [partitionID=444925537510994480] [shard=by-dev-rootcoord-dml_1_444925537510994479v0] [segmentID=446469289939867624] [rowNum=113963] [segmentType=Sealed]
milvus-standalone | [2024/02/06 14:00:17.152 +00:00] [INFO] [segments/segment_loader.go:578] ["load fields..."] [traceID=6064a8fde4ae287d4e6b60a3cb78fa01] [collectionID=444925537510994479] [partitionID=444925537510994480] [shard=by-dev-rootcoord-dml_1_444925537510994479v0] [segmentID=446469289939867624] [indexedFields="[111]"]
milvus-standalone | [2024/02/06 14:00:17.154 +00:00] [INFO] [segments/segment_loader.go:186] ["start loading..."] [traceID=4be7623495bd8f0653b6ce8e766c3521] [collectionID=444925537510994485] [segmentType=Sealed] [segmentNum=1] [afterFilter=1]
milvus-standalone | [2024/02/06 14:00:17.154 +00:00] [WARN] [delegator/delegator_data.go:392] ["worker failed to load segments"] [traceID=676dea72830c04acd4fcdc4a91f25e5a]
@yanliang567 Thank you for your help. So I understand this is a memory problem. How can I add more memory to the Milvus pod/container, or can I update to Milvus 2.3.8 to make sure the previous data loads normally?
Milvus 2.3.8 fixed issues with memory prediction, so it will not help if you have more data than the memory can hold. If you are running docker compose on a Mac, you need to allocate more CPU or memory in the Docker manager.
I am running docker-compose on Linux. How do I change the amount of memory? I have not found the relevant configuration.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
.
Then you just need more memory on the Linux host.
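For docker-compose on Linux, a per-container memory limit is usually set in `docker-compose.yml`. A minimal sketch follows; the service name `standalone` and the `8G` figure are assumptions, so match them to your own compose file and the RAM actually free on the host:

```yaml
services:
  standalone:
    # Compose v2 ("docker compose") applies deploy.resources.limits
    # to plain containers as well, not only in Swarm mode.
    deploy:
      resources:
        limits:
          memory: 8G
    # With legacy docker-compose v1, the older top-level form is used instead:
    # mem_limit: 8G
```

Note that a container limit only caps memory, it does not grant it: if the host itself lacks enough free RAM to hold the loaded segments, the host needs more physical memory (or Milvus needs less data loaded).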
Is there an existing issue for this?
Environment
Current Behavior
This error occurred shortly after I started the milvus database.
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
milvus-standalone | [2024/02/06 06:34:17.413 +00:00] [DEBUG] [client/client.go:96] ["RootCoordClient GetSessions success"] [address=172.20.0.4:53100] [serverID=17]
milvus-standalone | [2024/02/06 06:34:17.413 +00:00] [ERROR] [grpcclient/client.go:405] ["retry func failed"] ["retry time"=0] [error="rpc error: code = Unknown desc = expectedNodeID=16, actualNodeID=17: node not match"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).call\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:405\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).Call\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:483\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:499\ngithub.com/milvus-io/milvus/internal/distributed/rootcoord/client.wrapGrpcCall[...]\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/rootcoord/client/client.go:120\ngithub.com/milvus-io/milvus/internal/distributed/rootcoord/client.(Client).DescribeCollection\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/rootcoord/client/client.go:196\ngithub.com/milvus-io/milvus/internal/datacoord.(CoordinatorBroker).HasCollection\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/coordinator_broker.go:135\ngithub.com/milvus-io/milvus/internal/datacoord.(ServerHandler).HasCollection.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/handler.go:372\ngithub.com/milvus-io/milvus/pkg/util/retry.Do\n\t/go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:40\ngithub.com/milvus-io/milvus/internal/datacoord.(ServerHandler).HasCollection\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/handler.go:371\ngithub.com/milvus-io/milvus/internal/datacoord.(ServerHandler).CheckShouldDropChannel\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/handler.go:411\ngithub.com/milvus-io/milvus/internal/datacoord.(ChannelManager).unwatchDroppedChannels\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/channel_manager.go:254\ngithub.com/milvus-io/milvus/internal/datacoord.(ChannelManager).Startup\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/channel_manager.go:164\ngithub.com/milvus-io/milvus/internal/datacoord.(Cluster).Startup\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/cluster.go:57\ngithub.com/milvus-io/milvus/internal/datacoord.(Server).initServiceDiscovery\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:490\ngithub.com/milvus-io/milvus/internal/datacoord.(Server).initDataCoord\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:347\ngithub.com/milvus-io/milvus/internal/datacoord.(Server).Init\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:316\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:108\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:229\ngithub.com/milvus-io/milvus/cmd/components.(*DataCoord).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/data_coord.go:49\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:112"]
Anything else?
No response