Closed skyao closed 2 years ago
First highlight for this bug:
Sometimes the certification test will be passed, this is just for good luck and depends on what app is sending to before we stop app-2:
I will add some log to show this.
And there is a question about this behavior: if there are two apps subscribed the same topic of kafka with same comsumer group, the events should only be sent to one of these two apps? From the test I found that only when this app is down, the events will sent to another app.
The second highlight for this bug: this bug is not exist in kafka pubsub component!
I have tested and verified that with kafka pubsub component, if the events is be sent to app-2 and app-2 is stopped, kafka comsumer rebalance will work and app-3 will start to receive the event, and this will happens in seconds after app-2 (in fact, dapr sidecar of app-2) is stopped.
This is why the certification test of kafka pubsub component is OK.
Update: I saw a value of 300 later, so this is not fixed to 301, but about 300....
And it is very tricky that the number of items remaining is always be 301 (which I can't understand):
flow.go:228: Completed step: wait
flow.go:220: Running step: assert messages(consumer rebalance)
watcher.go:390: Timed out with 301 items remaining
flow.go:228: Completed step: assert messages(consumer rebalance)
I am not a Kafka expert, but normally in Dapr pubsub messages are delivered to each instance of the application, or when that’s not possible, we set a specific consumer group.
Don’t quote on me on that, but I don’t believe there’s other instances when we rebalance from “app-2” to “app-3”. I don’t know if we even want that?
I am not a Kafka expert, but normally in Dapr pubsub messages are delivered to each instance of the application, or when that’s not possible, we set a specific consumer group.
Don’t quote on me on that, but I don’t believe there’s other instances when we rebalance from “app-2” to “app-3”. I don’t know if we even want that?
I couldn't find the root cause for a long time, until I finally added the log and saw the actual log content.
This is some of the log after app-2 is stopped, no events are sent to app-3:
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 464
time="2022-06-20T15:45:56.68836827Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 465
time="2022-06-20T15:45:56.788676919Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 466
time="2022-06-20T15:45:56.888037889Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 467
time="2022-06-20T15:45:56.98805577Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 468
time="2022-06-20T15:45:57.087909192Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 469
time="2022-06-20T15:45:57.188531825Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 470
time="2022-06-20T15:45:57.288259493Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 471
time="2022-06-20T15:45:57.388276102Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 472
time="2022-06-20T15:45:57.488084206Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 473
time="2022-06-20T15:45:57.588489451Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/simulate.go:34: Simulating error 27
time="2022-06-20T15:45:57.58953958Z" level=error msg="Error processing Kafka message: neworder/1/1473 [key=dGVzdA==]. Error: fails to send binding event to http app channel, status code: 500 body: simulated error\n. Retrying..." instance=skyserver scope=dapr.components type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 474
time="2022-06-20T15:45:57.640810192Z" level=info msg="Successfully processed Kafka message after it previously failed: neworder/1/1473 [key=dGVzdA==]" instance=skyserver scope=dapr.components type=log ver=unknown
time="2022-06-20T15:45:57.688217251Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 475
time="2022-06-20T15:45:57.787850135Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 476
time="2022-06-20T15:45:57.887802611Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 477
time="2022-06-20T15:45:57.988724409Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 478
time="2022-06-20T15:45:58.088478848Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 479
time="2022-06-20T15:45:58.187809885Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 480
time="2022-06-20T15:45:58.288633017Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 481
time="2022-06-20T15:45:58.388047425Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 482
time="2022-06-20T15:45:58.487964783Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 483
time="2022-06-20T15:45:58.588153951Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 484
time="2022-06-20T15:45:58.687729645Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 485
time="2022-06-20T15:45:58.78873586Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 486
time="2022-06-20T15:45:58.888542868Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 487
time="2022-06-20T15:45:58.988532627Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 488
time="2022-06-20T15:45:59.087963296Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 489
time="2022-06-20T15:45:59.187947176Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 490
time="2022-06-20T15:45:59.288255927Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 491
time="2022-06-20T15:45:59.388282484Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 492
time="2022-06-20T15:45:59.487517448Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 493
time="2022-06-20T15:45:59.588655314Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 494
time="2022-06-20T15:45:59.6887103Z" level=info msg="gRPC API Called: /dapr.proto.runtime.v1.Dapr/InvokeBinding" instance=skyserver scope=dapr.runtime.grpc.api-info type=log ver=unknown
/home/sky/work/code/dapr-fork/components-contrib/tests/certification/bindings/kafka/kafka_retry_test.go:88: ======== app-1 received event: Background message - 495
The cause of this bug is found and should be fixed by PR https://github.com/dapr/components-contrib/pull/1804
This bug is fixed by above PR.
Close it.
Expected Behavior
In our certification test, we have a consumer rebalance case for kafak (both in kafak pubsub component and kafka binding component). The test logic is:
Actual Behavior
After app-2 stopped, there's no event sent to app-3.
So the certification test failed for this: there are some events not received.
Steps to Reproduce the Problem
Our certification test for kakfa binding component failed for this issue, see:
https://github.com/dapr/components-contrib/runs/6958628713?check_suite_focus=true https://github.com/dapr/components-contrib/runs/6965196737?check_suite_focus=true
Release Note