Closed jhi closed 3 years ago
Note that while the demo program tries the `Consume()` after the `Produce()` only once, and fails due to the polling timeout, the original code I extracted this from did try consuming in a loop, so there were plenty of retries, up until the 10-minute timeout of `go test` (originally this was an attempt at a unit test). So retrying is not the answer.
I also just now tested having a single six-minute sleep after the produce. Nope, doesn't help; the consume after it still times out.
More experimentation: I decided to check that the `Produce()` side is not broken and modified the code to test with direct Kafka (sarama) producer code instead of the pixy `Produce()`. Found no problem there: producing via pixy was working fine (I also verified the result with `kafkacat`).
I ended up doing the full matrix:
producer=P topic=C group=C OK
producer=P topic=G group=C FAIL
producer=P topic=C group=G FAIL
producer=P topic=G group=G FAIL
producer=K topic=C group=C OK
producer=K topic=G group=C FAIL
producer=K topic=C group=G FAIL
producer=K topic=G group=G FAIL
Legend: producer P(ixy) or K(afka), topic C(onstant) or G(enerated), group C(onstant) or G(enerated).
Conclusion: if either the topic name or the consumer group name is freshly generated, the code fails. Only if both the topic and the group already exist does the code succeed.
Success meaning that the testing sequence first successfully fetches the initial offset, and then consumes+acks the messages.
Failure meaning that the (initial) `ConsNAck()` never seems to work, always returning the long poll timeout, even though there definitely is content available in the topic. So it's more complex than my original conclusion that it's only about the group being freshly generated.
(And producing was found to be faultless.)
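The rule read off the matrix can be written down as a small predicate. This is only a sketch of the observed outcomes, not of what the proxy does internally; the names `expectOK`, `label` and `matrix` are made up for illustration.

```go
package main

import "fmt"

// expectOK encodes the observed rule: the consume succeeds only when
// neither the topic nor the consumer group was freshly generated.
// The producer (pixy or sarama) made no difference to the outcome.
func expectOK(topicGenerated, groupGenerated bool) bool {
	return !topicGenerated && !groupGenerated
}

// label maps a flag onto the C(onstant)/G(enerated) legend.
func label(generated bool) string {
	if generated {
		return "G"
	}
	return "C"
}

// matrix rebuilds the eight rows in the same order as the table above:
// producer outermost, then group, with the topic varying fastest.
func matrix() []string {
	var rows []string
	for _, producer := range []string{"P", "K"} {
		for _, group := range []bool{false, true} {
			for _, topic := range []bool{false, true} {
				result := "FAIL"
				if expectOK(topic, group) {
					result = "OK"
				}
				rows = append(rows, fmt.Sprintf("producer=%s topic=%s group=%s %s",
					producer, label(topic), label(group), result))
			}
		}
	}
	return rows
}

func main() {
	for _, row := range matrix() {
		fmt.Println(row)
	}
}
```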
It is not crazy and not a bug; it is expected behaviour. It is described here: https://github.com/mailgun/kafka-pixy/blob/master/quick-start-curl.md.
The attached file is a result of `git archive` from my demo application demonstrating the issue. It's a little bigger than I anticipated since it contains all the up-to-date vendor dependencies, sorry about that.
The demo code itself is short, in `cmd/pixy-issue/main.go`. It could be even shorter but I tried to be as tidy and explicit as possible.
The bug in short: if a consumer group has just been created, it seems that one "warm-up" `ConsumeNAck()` call is necessary before the first `Produce()`, or otherwise any `ConsumeNAck()` calls following will fail due to the polling timeout. Yeah, sounds crazy, I know.
But it really seems that the dummy `ConsumeNAck` (which naturally itself fails due to timeout, since there is nothing yet produced) is needed to do ... something ... maybe it is needed for registering the consumer group properly (I am just waving my arms here).
Run `make test` to build and run the tests. Kafka is assumed to be running at `localhost:9092` and the proxy at `localhost:19091`; command line flags are available.
The contained `README.md` gives more details.
pixy-issue.tar.gz