tmodak27 opened 1 week ago (status: Open)
@lec-bit Did you figure it out?
I got the same errors after 16k services. We designed it based on 5,000 services and 100k (10w) pods. https://github.com/kmesh-net/kmesh/issues/318#issuecomment-2114550669
What error did you meet first?
We performed a load test using pilot-load to verify maximum number of pods.
We first met the `malloc(): invalid next size` error. Our pilot-load config:

```yaml
nodeMetadata: {}
jitter:
  workloads: "110ms"
  config: "0s"
namespaces:
- name: foo
  replicas: 1
  applications:
  - name: foo
    replicas: 1
    instances: 400
nodes:
- name: node
  count: 5
```
Create the configmap:

```shell
kubectl create configmap config-400-pod -n pilot-load --from-file=config.yaml=svc.yaml --dry-run=client -o yaml | kubectl apply -f -
```

Then set `volumes.name.configMap.name` in `load-deployment.yaml` to `config-400-pod`, and run `kubectl apply -f load-deployment.yaml`.
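For reference, the `volumes.name.configMap.name` change refers to a stanza like the following in `load-deployment.yaml`. This is a sketch: the surrounding field values (the volume name, etc.) are illustrative, not taken from the thread; only the `configMap.name` value is the part that must be edited.

```yaml
# Hypothetical excerpt of load-deployment.yaml (field values illustrative).
# Point the configMap volume at the configmap created above.
volumes:
- name: config
  configMap:
    name: config-400-pod   # <- set this to the configmap you just created
```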
It will take 1 to 2 minutes for all the mock pods to get deployed.

The config for the second test:

```yaml
nodeMetadata: {}
jitter:
  workloads: "110ms"
  config: "0s"
namespaces:
- name: foo
  replicas: 1
  applications:
  - name: foo
    replicas: 1
    instances: 100
nodes:
- name: node
  count: 3
```
Edit: Both of the above tests were replicated multiple times with the deployment order reversed, i.e. deploying the mock pods first and Kmesh second. Here is what we observed:
We usually get `malloc(): invalid next size (unsorted)`, but in a few cases we also get `malloc(): mismatching next->prev_size (unsorted)`.
@nlgwcy @hzxuzhonghu This seems like a critical bug; can you take some time to look into the root cause?
Motivation: Our production use case requires support for a very large number of services and instances.
1. What we did
Environment Details:
We started scaling up in batches of 1,000 services using the YAML file and command below.

```shell
for i in $(seq 1 1000); do
  sed "s/foo-service/foo-service-0-$(date +%s-%N)/g" svc.yaml | kubectl apply -f -
done
```
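The contents of `svc.yaml` are not shown in the thread. For the loop above to work, it only needs to contain the literal string `foo-service`, which `sed` rewrites to a unique name on each iteration. A minimal sketch of what such a manifest might look like (all field values here are assumptions, not taken from the thread):

```yaml
# Hypothetical svc.yaml: any manifest containing "foo-service" works
# with the sed loop above. One plausible shape:
apiVersion: v1
kind: Service
metadata:
  name: foo-service   # rewritten to foo-service-0-<timestamp> per iteration
  namespace: foo
spec:
  selector:
    app: foo
  ports:
  - port: 80
    targetPort: 8080
```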