uber / uReplicator

Improvement of Apache Kafka Mirrormaker
Apache License 2.0
914 stars 198 forks source link

uReplicator fails on messages with headers #193

Closed MatanShabi closed 5 years ago

MatanShabi commented 5 years ago

uReplicator master branch throws exceptions while trying to consume messages with headers (produced by producer version 1.1.0), the same uReplicator works fine with the same producer just without headers.

relevant logs:

[2018-12-12 10:13:04,882] ERROR [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-1]: Error for partition [topic2,0] to broker 1:class kafka.common.UnknownException (kafka.mirrormaker.CompactConsumerFetcherThread:74)
[2018-12-12 10:13:04,882] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-1]: handling partitions with error for Set(topic2-0) (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:04,882] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-1]: enter removePartitions in CompactConsumerFetcherThread 33 for set Set(topic2-0) (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:04,882] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-1]: Finish removePartitions in CompactConsumerFetcherThread 33 for set Set(topic2-0) (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:04,882] INFO [CompactConsumerFetcherManager-1544602069873] adding partitions with error Set(topic2-0) (kafka.mirrormaker.CompactConsumerFetcherManager:66)
[2018-12-12 10:13:04,902] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: Trying to remove lag metrics for kloak-mirrormaker-test, topic2, 2 (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:04,902] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: Trying to remove lag metrics for kloak-mirrormaker-test, topic1, 0 (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:04,981] INFO [kloak-mirrormaker-test_kloakmms01-sjc1-leader-finder-thread]: Partitions without leader Set(topic1-0, topic2-0, topic2-2) (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2018-12-12 10:13:04,982] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-1]: Trying to remove lag metrics for kloak-mirrormaker-test, topic2, 0 (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:04,988] INFO Verifying properties (kafka.utils.VerifiableProperties:66)
[2018-12-12 10:13:04,988] INFO Property client.id is overridden to kloak-mirrormaker-test (kafka.utils.VerifiableProperties:66)
[2018-12-12 10:13:04,988] INFO Property metadata.broker.list is overridden to source_host1:9092,source_host2:9092,source_host3:9092 (kafka.utils.VerifiableProperties:66)
[2018-12-12 10:13:04,988] INFO Property request.timeout.ms is overridden to 30000 (kafka.utils.VerifiableProperties:66)
[2018-12-12 10:13:04,988] INFO Fetching metadata from broker BrokerEndPoint(2,source_host2,9092) with correlation id 1394 for 2 topic(s) Set(topic1, topic2) (kafka.client.ClientUtils$:66)
[2018-12-12 10:13:04,990] INFO Connected to source_host2:9092 for producing (kafka.producer.SyncProducer:66)
[2018-12-12 10:13:04,994] INFO Disconnecting from source_host2:9092 (kafka.producer.SyncProducer:66)
[2018-12-12 10:13:04,994] INFO [kloak-mirrormaker-test_kloakmms01-sjc1-leader-finder-thread]: Try find leader for topic: topic1, partition:0 (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2018-12-12 10:13:04,994] INFO [kloak-mirrormaker-test_kloakmms01-sjc1-leader-finder-thread]: Try find leader for topic: topic2, partition:2 (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2018-12-12 10:13:04,994] INFO [kloak-mirrormaker-test_kloakmms01-sjc1-leader-finder-thread]: Try find leader for topic: topic2, partition:0 (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2018-12-12 10:13:04,994] INFO [kloak-mirrormaker-test_kloakmms01-sjc1-leader-finder-thread]: Find leader for partitions finished, took: 13 ms (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2018-12-12 10:13:04,994] INFO [CompactConsumerFetcherManager-1544602069873] Fetcher Thread for topic partitions: ArrayBuffer([topic2-2, InitialOffset 0] , [topic1-0, InitialOffset 0] ) is CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3 (kafka.mirrormaker.CompactConsumerFetcherManager:66)
[2018-12-12 10:13:04,994] INFO [CompactConsumerFetcherManager-1544602069873] Fetcher Thread for topic partitions: ArrayBuffer([topic2-0, InitialOffset 58] ) is CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-1 (kafka.mirrormaker.CompactConsumerFetcherManager:66)
[2018-12-12 10:13:04,994] INFO [CompactConsumerFetcherManager-1544602069873] Added fetcher for partitions ArrayBuffer([topic2-0, initOffset 58 to broker BrokerEndPoint(1,source_host1,9092)] , [topic1-0, initOffset 0 to broker BrokerEndPoint(3,source_host3,9092)] , [topic2-2, initOffset 0 to broker BrokerEndPoint(3,source_host3,9092)] ) (kafka.mirrormaker.CompactConsumerFetcherManager:66)
[2018-12-12 10:13:05,106] ERROR [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: Error for partition [topic1,0] to broker 3:class kafka.common.UnknownException (kafka.mirrormaker.CompactConsumerFetcherThread:74)
[2018-12-12 10:13:05,106] ERROR [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: Error for partition [topic2,2] to broker 3:class kafka.common.UnknownException (kafka.mirrormaker.CompactConsumerFetcherThread:74)
[2018-12-12 10:13:05,106] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: handling partitions with error for Set(topic1-0, topic2-2) (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:05,106] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: enter removePartitions in CompactConsumerFetcherThread 31 for set Set(topic1-0, topic2-2) (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:05,106] INFO [CompactConsumerFetcherThread-kloak-mirrormaker-test_kloakmms01-sjc1-0-3]: Finish removePartitions in CompactConsumerFetcherThread 31 for set Set(topic1-0, topic2-2) (kafka.mirrormaker.CompactConsumerFetcherThread:66)
[2018-12-12 10:13:05,106] INFO [CompactConsumerFetcherManager-1544602069873] adding partitions with error Set(topic1-0, topic2-2) (kafka.mirrormaker.CompactConsumerFetcherManager:66)
java -version:

openjdk version "1.8.0_131"
openjdk Runtime Environment (build 1.8.0_131-b12)
openjdk 64-Bit Server VM (build 25.131-b12, mixed mode)            
rantav commented 5 years ago

I'm seeing the same issue. The same messages without headers are OK but when headers are sent the worker kind of crashes - it stops mirroring of all topics, not just the ones that have messages with headers.

To clarify: when there's a topic that gets replicated and contains even one message with headers - uReplicator completely crashes and burns, there's no recovery from that. All replication of all other topics completely stops.

xhl1988 commented 5 years ago

Edit: previous comment is not correct.

yangy0000 commented 5 years ago

@MatanShabi @rantav I tried to mirroring data with headers but I couldn't reproduce your problem. Are you able to provide an simple reproduce step?

rantav commented 5 years ago

@MatanShabi @rantav I tried to mirroring data with headers but I couldn't reproduce your problem. Are you able to provide an simple reproduce step?

Really? OK, that's surprising. for it failed pretty consistently for me the second I started sending headers, but there's always a chance that something else was going on.

@00Sheep00 Are you saying that right now uReplicator is supposed to support messages with headers? If so then I'll try to get a simple repro (my current repro involves two data centers, two k8s clusters and other perks you're not keen to touch if you don't have to...)

yangy0000 commented 5 years ago

you saying that right now uReplicator is supposed to support messages with headers? If so then I'll try to get a simple repro (my current repro involves two data centers, two k8s clusters and other perks you're not keen to touch if you don't have to...)

no it doesn't support headers, when I produce messages with headers from source cluster , I was able to receive it on destination cluster, without headers.

rantav commented 5 years ago

no it doesn't support headers, when I produce messages with headers from source cluster , I was able to receive it on destination cluster, without headers.

I see. But it didn't completely crash on you? OK let me do some more checks it's possible that it crashed for other reasons. But it definitely consistently crashed. I moved to working on something else but I'll try to set some time aside to get a decent repro for you

yangy0000 commented 5 years ago

no it doesn't support headers, when I produce messages with headers from source cluster , I was able to receive it on destination cluster, without headers.

I see. But it didn't completely crash on you? OK let me do some more checks it's possible that it crashed for other reasons. But it definitely consistently crashed. I moved to working on something else but I'll try to set some time aside to get a decent repro for you

that's right, it didn't crash. thank you for the addition info! I will have a try on docker container. Also, May I know your kafka version, producer api version and which language your are using to produce message.

rantav commented 5 years ago

Kafka: 2.1.0 (using this docker solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1)

Producer: written in Go using github.com/confluentinc/confluent-kafka-go version 0.11.6 (which subsequently uses the C library librdkafka-dev version 0.11.6, the latest)

rantav commented 5 years ago

FYI I tried to repro the crash I'd seen before with message headers and I can't. I assume it wasn't related to the headers but I cannot repro it at all so let's assume it didn't happen and if it happens again I'll do the research ;)

yangy0000 commented 5 years ago

FYI I tried to repro the crash I'd seen before with message headers and I can't. I assume it wasn't related to the headers but I cannot repro it at all so let's assume it didn't happen and if it happens again I'll do the research ;)

thanks for follow up. FYI, we have plan to supporting header in this year.

yangy0000 commented 5 years ago

There is a known issue on kafka 1.1.0 https://issues.apache.org/jira/browse/KAFKA-6739 Please upgrade to kafka 1.1.1

MatanShabi commented 5 years ago

There is a known issue on kafka 1.1.0 https://issues.apache.org/jira/browse/KAFKA-6739 Please upgrade to kafka 1.1.1

thanks! it worked 👍 .