streamnative / kop

Kafka-on-Pulsar - A protocol handler that brings native Kafka protocol to Apache Pulsar
https://streamnative.io/docs/kop
Apache License 2.0

[BUG] Benchmark driver-kop kafka_to_kafka.yaml with kafkaConfig acks=all (-1): benchmark-worker producer and consumer clients report errors #1514

Open caihualin opened 1 year ago

caihualin commented 1 year ago

Describe the bug

When the benchmark driver-kop kafka_to_kafka.yaml sets acks=all (-1) in kafkaConfig, the benchmark-worker producer and consumer clients report errors.

To Reproduce

  1. Cluster nodes: 3 broker and 3 bookie nodes deployed together, plus 3 ZooKeeper nodes.
  2. Configuration:
     1) Server configuration
        - broker.conf: managedLedgerDefaultEnsembleSize=3, managedLedgerDefaultWriteQuorum=3, managedLedgerDefaultAckQuorum=2; default values for all other parameters.
        - bookie storage.conf: journalSyncData=true, journalWriteData=true; default values for all other parameters.
     2) Client configuration
        - driver-kop kafka_to_kafka.yaml: replicationFactor=3, min.insync.replicas=2, acks=all; default values for all other parameters.
  3. Test scenario: send rate 10000000, 1 topic, 3 partitions, 4 producers per topic, 1 consumer per subscription, and 1 subscription.

Expected behavior

The microbenchmark test should run normally.
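For reference, the settings listed in step 2 can be written down and sanity-checked with a minimal sketch like the one below. The flat key=value layout and the helper names (`parse_properties`, `quorum_settings_consistent`) are assumptions made for illustration only; the real kafka_to_kafka.yaml is a YAML file and is not reproduced here. The check only encodes the usual BookKeeper ordering of ensemble size >= write quorum >= ack quorum; it does not diagnose the errors reported in this issue.

```python
# Hypothetical helper: parse simple key=value lines into a dict.
def parse_properties(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Values copied from the issue; everything else is left at its default.
BROKER_CONF = """
managedLedgerDefaultEnsembleSize=3
managedLedgerDefaultWriteQuorum=3
managedLedgerDefaultAckQuorum=2
"""

BOOKIE_CONF = """
journalSyncData=true
journalWriteData=true
"""

# Flattened for illustration; not the actual layout of kafka_to_kafka.yaml.
CLIENT_CONF = """
replicationFactor=3
min.insync.replicas=2
acks=all
"""

# Hypothetical check: ensemble size >= write quorum >= ack quorum >= 1.
def quorum_settings_consistent(broker):
    ensemble = int(broker["managedLedgerDefaultEnsembleSize"])
    write_quorum = int(broker["managedLedgerDefaultWriteQuorum"])
    ack_quorum = int(broker["managedLedgerDefaultAckQuorum"])
    return ensemble >= write_quorum >= ack_quorum >= 1

broker = parse_properties(BROKER_CONF)
bookie = parse_properties(BOOKIE_CONF)
client = parse_properties(CLIENT_CONF)

print("quorum settings consistent:", quorum_settings_consistent(broker))
print("client acks:", client["acks"])
```

With the values from this report the quorum settings are internally consistent, so the sketch only confirms that the reported configuration is well formed.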

Screenshots

Reported error info:

1) producer

[kafka-producer-network-thread | producer-2] INFO org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-2] Node -1 disconnected.
[kafka-producer-network-thread | producer-2] INFO org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-2] Cancelled in-flight METADATA request with correlation id 2 due to node -1 being disconnected (elapsed time since creation: 6ms, elapsed time since send: 6ms, request timeout: 1200000ms)
[kafka-producer-network-thread | producer-2] INFO org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-2] Cancelled in-flight INIT_PRODUCER_ID request with correlation id 3 due to node -1 being disconnected (elapsed time since creation: 5ms, elapsed time since send: 5ms, request timeout: 1200000ms)
[kafka-producer-network-thread | producer-2] WARN org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-2] Bootstrap broker 172.20.140.169:9092 (id: -1 rack: null) disconnected
[kafka-producer-network-thread | producer-3] INFO org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Node -1 disconnected.

2) consumer

[pool-1-thread-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-sub-000-jnnjO5Q-1, groupId=sub-000-jnnjO5Q] (Re-)joining group
[pool-1-thread-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-sub-000-jnnjO5Q-1, groupId=sub-000-jnnjO5Q] JoinGroup failed: Coordinator 172.20.140.167:9092 (id: 548361440 rack: null) is loading the group.

Additional context

Note: If the acks parameter in the benchmark driver-kop kafka_to_kafka.yaml is changed from "all" to "1", no errors occur.

BewareMyPower commented 1 year ago

I see the same issue in https://forum.streamnative.io/t/openmessaging-benchmark-kop/484.

Could you follow my suggestion to provide debug logs?