milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: etcd start failed: max entry size limit exceeded #26855

Closed darkerin closed 1 year ago

darkerin commented 1 year ago

Is there an existing issue for this?

Environment

- Milvus version: v2.3.0
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    rocksmq
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Ubuntu18.04
- CPU/Memory:  4C32G
- GPU: 
- Others:

Current Behavior

The etcd container failed to start and output the following error:

{"level":"info","ts":"2023-09-05T11:05:51.158Z","caller":"embed/etcd.go:371","msg":"closing etcd server","name":"default","data-dir":"/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://127.0.0.1:2379"]}
{"level":"info","ts":"2023-09-05T11:05:51.158Z","caller":"embed/etcd.go:373","msg":"closed etcd server","name":"default","data-dir":"/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://127.0.0.1:2379"]}
{"level":"fatal","ts":"2023-09-05T11:05:51.158Z","caller":"etcdmain/etcd.go:204","msg":"discovery failed","error":"wal: max entry size limit exceeded, recBytes: 256, fileSize(64000000) - offset(63999936) - padBytes(0) = entryLimit(64)","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\t/tmp/etcd-release-3.5.5/etcd/release/etcd/server/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\t/tmp/etcd-release-3.5.5/etcd/release/etcd/server/etcdmain/main.go:40\nmain.main\n\t/tmp/etcd-release-3.5.5/etcd/release/etcd/server/main.go:32\nruntime.main\n\t/usr/local/google/home/siarkowicz/.gvm/gos/go1.16.15/src/runtime/proc.go:225"}

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

milvus.log: milvus.log

etcd log: etcd.log

Anything else?

No response

xiaofan-luan commented 1 year ago

It seems that you need to bring up your etcd cluster first.

According to https://github.com/etcd-io/etcd/issues/14025, this is a bug in etcd versions before 3.5.5.

But Milvus should already be using etcd 3.5.5.

Are you sharing the etcd instance with other services?

xiaofan-luan commented 1 year ago

Also, I don't think Milvus could generate a 64 MB WAL entry.

yanliang567 commented 1 year ago

@darkerin as noted in the comments above, please share more information about how you deployed Milvus and any configuration changes you made. /assign @darkerin /unassign

darkerin commented 1 year ago

@xiaofan-luan @yanliang567

The etcd 3.5.5 container is used only by Milvus.

Before this, I tried to create 500 partitions in a collection and modified the etcd parameter ETCD_MAX_TXN_OPS (see https://github.com/milvus-io/milvus/issues/26748).

When I shut down the container and tried to start it again, it failed.

xiaofan-luan commented 1 year ago

/assign @yanliang567 Could you try to reproduce this issue on 2.3.x with 4096 partitions?

xiaofan-luan commented 1 year ago

Let us try to reproduce this issue.

darkerin commented 1 year ago

Thanks a lot.

yanliang567 commented 1 year ago

I could not reproduce this on the latest 2.3.0 image.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

yhmo commented 7 months ago

Another user encountered this error with standalone (embedded etcd).

Another etcd issue says this bug is fixed in etcd v3.5.7: https://github.com/etcd-io/etcd/issues/15090
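
To check whether a given etcd image tag is at or above the version that the linked etcd issue reports as fixed, a comparison along these lines may help. This is a minimal sketch: `parse_version` and `has_reported_fix` are hypothetical helpers, the threshold 3.5.7 is taken from the comment above rather than verified independently, and packaging suffixes such as `-r2` are simply ignored.

```python
import re

# Version that the etcd issue linked above reports as containing the fix.
FIXED_IN = (3, 5, 7)


def parse_version(text):
    """Return (major, minor, patch) from strings such as 'v3.5.5' or '3.5.5-r2'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_reported_fix(version):
    """True if the version is at or above the reported fix version."""
    return parse_version(version) >= FIXED_IN


for candidate in ("3.5.5", "3.5.5-r2", "v3.5.7", "3.5.14"):
    print(candidate, has_reported_fix(candidate))
```

Comparing integer tuples instead of raw strings avoids ordering "3.5.14" before "3.5.7".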

tianshihan818 commented 7 months ago

Hi! Does Milvus support deploying etcd v3.5.7 now?

xiaofan-luan commented 7 months ago

@LoveEachDay

AlexeyIvanov8 commented 1 week ago

Hi! I have the same issue with etcd 3.5.5-r2 installed by the Milvus operator. Bumping the question above:

Does Milvus support deploying etcd v3.5.7 now?

xiaofan-luan commented 6 days ago

We are upgrading to etcd 3.5.14 or 3.5.15 in the latest release.