Closed: ylgeeker closed this issue 8 years ago
librdkafka currently holds on to each broker it has ever seen, which means it will still try to connect to an old replaced broker (old address).
This should be fixed: any learnt broker that has not been reported in the official broker list, either for some time or perhaps as judged by a quorum of brokers, should be removed.
I hit the same issue. I simply update rkb->rkb_nodename and rkb->rkb_name in the rd_kafka_broker_update method when a broker is replaced; I am testing this now.
At init, rd_kafka_brokers_add is called with nodeid = -1, and I find the nodeid is never updated, so the number of rd_kafka_broker_thread_main threads is 2x the number of brokers. I think one rd_kafka_broker_thread_main thread per broker would be better, with nodeid = -1 updated to the real broker id.
It will only be able to migrate a bootstrap broker (-1) to a proper broker handle if the hostname and port matches exactly.
I added debug output showing the broker info in the rd_kafka_broker_metadata_reply method, and got the following:
%7|1448371123.402|BROKER|sz_write#producer-0| 10.240.113.74:9092/bootstrap: [TEST-0]show broker info : 10.240.113.74:9092/bootstrap / -1
%7|1448371123.402|BROKER|sz_write#producer-0| 10.240.113.74:9092/bootstrap: [TEST-1]show broker info : 10.240.113.74:9092 / -1
%7|1448371123.576|BROKER|sz_write#producer-0| 10.240.113.74:9092/3: [TEST-0]show broker info : 10.240.113.74:9092/3 / 3
%7|1448371123.576|BROKER|sz_write#producer-0| 10.240.113.74:9092/3: [TEST-1]show broker info : 10.240.113.74:9092 / 3
%7|1448371133.411|BROKER|sz_write#producer-0| 10.240.113.74:9092/bootstrap: [TEST-0]show broker info : 10.240.113.74:9092/bootstrap / -1
%7|1448371133.411|BROKER|sz_write#producer-0| 10.240.113.74:9092/bootstrap: [TEST-1]show broker info : 10.240.113.74:9092 / -1
%7|1448371134.486|BROKER|sz_write#producer-0| 10.240.113.74:9092/3: [TEST-0]show broker info : 10.240.113.74:9092/3 / 3
%7|1448371134.486|BROKER|sz_write#producer-0| 10.240.113.74:9092/3: [TEST-1]show broker info : 10.240.113.74:9092 / 3
It does not migrate the bootstrap broker to a proper broker handle.
What version is this on? Latest master?
Can you try the same on the 0.9.0 branch?
I use 0.8.6.
The functionality of migrating a broker handle from bootstrap to proper is only available on the master branch.
Using the master branch, the bootstrap broker is migrated to a proper broker handle. But when a broker is replaced, the hostname is not updated; will the invalid hostname be kept for some time?
librdkafka currently won't ever forget about a broker, so if a broker is decommissioned it will still try to connect to it indefinitely.
Will the new broker be connected?
In the log, the client discovers the new broker but does not connect to it.
librdkafka will periodically poll broker metadata from connected brokers, that metadata includes a list of all brokers in the cluster. So if you add new brokers to an existing cluster and librdkafka is connected to at least one existing broker it will eventually learn of the new brokers.
The phenomenon: a new kafka-server replaces an old kafka-server using the same broker id. After the replacement, librdkafka can see the new kafka-server with its new hostname, but it does not connect to the new hostname, and when kafka-preferred-replica-election.sh is run, many more messages fail to be delivered.
Ah, yes, so this is fixed in master. The 0.8.6 code looks up by broker id first; if the broker id is already known it will not update the hostname: https://github.com/edenhill/librdkafka/blob/0.8.6/src/rdkafka_broker.c#L4427
In master it checks if the hostname changed and if so updates it: https://github.com/edenhill/librdkafka/blob/master/src/rdkafka_broker.c#L4427
The hostname is not updated even when using the master branch; it still tries to connect to the old broker hostname.
Debug log:
%7|1448543340.290|METADATA|sz_write#producer-1| 10.240.113.74:9092/3: Broker #1/2: 10.231.137.162:9092 NodeId 2
%7|1448543340.290|BROKER|sz_write#producer-1| 10.231.137.162:9092 NodeID 2
%7|1448543340.290|BROKER|sz_write#producer-1| old broker 10.240.113.72:9092, new broker 10.231.137.162:9092
.......
%7|1448543341.167|CONNECT|sz_write#producer-1| 10.240.113.72:9092/2: broker in state DOWN connecting
%7|1448543341.169|CONNECT|sz_write#producer-1| 10.240.113.72:9092/2: couldn't connect to ipv4#10.240.113.72:9092: Connection refused
.......
%7|1448543566.162|CONNECT|sz_write#producer-1| 10.240.113.72:9092/2: couldn't connect to ipv4#10.240.113.72:9092: Connection refused
This should be fixed now.
I use ZooKeeper to manage the kafka cluster, and the client watches the cluster's node in ZooKeeper.
After a kafka-server was replaced by a new server, I found that:
but the IP address "xx.xx.xx.xx:9092" is invalid; it is the replaced host, i.e. the old kafka-server.
Why? What should I do? Please!