knative / eventing

Event-driven application platform for Kubernetes
https://knative.dev/docs/eventing
Apache License 2.0

Leader election prevents scaling of IMC dispatcher, causes dropped events #4638

Closed: antoineco closed this issue 3 years ago

antoineco commented 3 years ago

Describe the bug

When I scale IMC dispatchers, only 1 instance seems to accept events for a given channel.

I currently have 5 replicas. When I send events to the channel's URL using Vegeta, I see the following report:

Requests      [total, rate, throughput]         1440000, 8000.01, 1590.51
Duration      [total, attack, wait]             3m0s, 3m0s, 514.117µs
Latencies     [min, mean, 50, 90, 95, 99, max]  238.272µs, 681.449µs, 471.749µs, 1.105ms, 1.775ms, 3.64ms, 51.477ms
Bytes In      [total, mean]                     0, 0.00
Bytes Out     [total, mean]                     2949120000, 2048.00
Success       [ratio]                           19.88%
Status Codes  [code:count]                      202:286292  500:1153708
Error Set:
500 Internal Server Error

Only ~20% (100% / 5 replicas) of events are accepted.
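As a quick sanity check on that figure, the sketch below recomputes the success ratio from the status-code counts in the report and compares it with 1 / replicas. The variable names are illustrative only, and the numbers are copied from the report above.

```python
# Status-code counts copied from the Vegeta report above.
status_counts = {202: 286292, 500: 1153708}
replicas = 5          # number of imc-dispatcher replicas during the test
attack_seconds = 180  # attack duration of 3m0s

total = sum(status_counts.values())
accepted = status_counts[202]

# Fraction of requests answered with 202 Accepted.
success_ratio = accepted / total

# Ratio one would expect if only one of the replicas accepted events
# and requests were spread evenly (an assumption, not a measured fact).
one_replica_share = 1 / replicas

print(f"total requests: {total}")
print(f"success ratio:  {success_ratio:.2%}")
print(f"1 / replicas:   {one_replica_share:.2%}")
print(f"accepted/sec:   {accepted / attack_seconds:.2f}")
```

The recomputed ratio agrees with the 19.88% in the report and sits very close to 1 / 5, which is consistent with a single replica accepting events.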

With 1 replica, the success ratio is 100%, given a node with sufficient capacity.

Expected behavior

Leader election should not affect event delivery: every dispatcher replica should be able to handle events for any Channel.
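The gap between the observed and expected behavior can be illustrated with a toy model. This is only a sketch under stated assumptions, not the dispatcher's actual logic: it assumes requests are spread round-robin across replicas and that a single hypothetical "leader" replica is the only one that accepts events for the channel. The function name `simulate` and its parameters are made up for illustration.

```python
from itertools import cycle


def simulate(replicas, requests, all_replicas_serve):
    """Return the fraction of requests accepted under a toy model."""
    leader = 0  # hypothetical: replica 0 is the leader for this channel
    accepted = 0
    targets = cycle(range(replicas))  # assumed round-robin load balancing
    for _ in range(requests):
        replica = next(targets)
        # Either every replica serves, or only the leader does.
        if all_replicas_serve or replica == leader:
            accepted += 1
    return accepted / requests


# Only the leader serves: the ratio is 1 / replicas.
leader_only = simulate(5, 10000, all_replicas_serve=False)

# Every replica serves: all requests are accepted.
all_serve = simulate(5, 10000, all_replicas_serve=True)

print(f"leader only: {leader_only:.0%}")
print(f"all serve:   {all_serve:.0%}")
```

Under these assumptions the leader-only case accepts one request in five, while the all-replicas case accepts everything, which is the behavior this issue asks for.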

To Reproduce

$ kubectl -n knative-eventing scale deployment imc-dispatcher --replicas 5

Send events to the channel and observe that most requests receive a 500 response.

Knative release version

Eventing v0.19.2

Additional context

antoineco commented 3 years ago

@mattmoor pointed me to the implementation in Kourier: https://github.com/knative-sandbox/net-kourier/blob/105c052d845634e88462d76e2ae384a95ffaba62/pkg/reconciler/ingress/ingress.go#L81

/assign