gousteris opened this issue 2 years ago (status: Open)
Note there are two failures in that CI link. @gousteris is there another issue for the coproc failure in that CI link?
@mmaslankaprv do I remember correctly that you recently added some heuristics where if leadership wasn't available you'd return leader from health manager or something?
@dotnwat I can only see coproc_fixture_rpunit in addition to the raft one
I think this may be related to #3486
Filed #3491 to track the coproc test failure
This looks different from #3486: the leader is not reading as a stale previous value, it's an unexpected new value.
Client is reading the leader as node 3, but node 3 is not a leader and never has been for any partition.
Client:
[DEBUG - 2022-01-16 07:27:32,893 - kafka_cat - _cmd_raw - lineno:54]: {"originating_broker":{"id":2,"name":"docker_n_17:9092/2"},"query":{"topic":"*"},"controllerid":2,"brokers":[{"id":3,"name":"docker_n_38:9092"},{"id":2,"name":"docker_n_17:9092"},{"id":1,"name":"docker_n_16:9092"}],"topics":[{"topic":"topic-ghcmgkqdld","partitions":[{"partition":0,"leader":3,"replicas":[{"id":1},{"id":2},{"id":3}],"isrs":[{"id":1},{"id":2},{"id":3}]}]}]}
Server:
TRACE 2022-01-16 07:27:32,892 [shard 0] kafka - request_context.h:146 - [172.18.0.51:47574] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker_n_38 port=9092 rack={nullopt}}, {node_id=2 host=docker_n_17 port=9092 rack={nullopt}}, {node_id=1 host=docker_n_16 port=9092 rack={nullopt}}} cluster_id={nullopt} controller_id=2 topics={{error_code={ error_code: none [0] } name={topic-ghcmgkqdld} is_internal=false partitions={{error_code={ error_code: none [0] } partition_index=0 leader_id=3 leader_epoch=-1 replica_nodes={{1}, {2}, {3}} isr_nodes={{1}, {2}, {3}} offline_replicas={}}} topic_authorized_operations=0}} cluster_authorized_operations=-2147483648}
# ^ that's node 2, which *is* the leader, telling the client that node 3 is the leader.
Not yet clear how that can possibly happen -- perhaps we have some place where a node ID is being used to index into an array and the nodes aren't in the order expected.
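To make that indexing hypothesis concrete, here's a minimal hypothetical sketch (Python, purely illustrative -- this is not Redpanda code, and the real codebase is C++): if the "no leader" sentinel -1 ever gets used as a positional index into a broker list, negative indexing silently returns the *last* broker rather than failing, which would surface exactly a bogus leader=3 on a three-node cluster.

```python
NO_LEADER = -1  # Kafka's metadata sentinel for "no current leader"

# Broker node IDs in the order they happen to appear in the metadata
brokers = [1, 2, 3]

def buggy_leader_lookup(leader_id: int) -> int:
    # BUG (hypothetical): treats the node ID / sentinel as a list index.
    # For leader_id = -1, Python wraps around to the last element.
    return brokers[leader_id]

# With no leader elected, the sentinel resolves to the last broker:
print(buggy_leader_lookup(NO_LEADER))  # -> 3
```

Again, this is only a sketch of the *class* of bug being hypothesized (ID-vs-index confusion), not a claim about where it lives in the actual code.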
Hmm, so the most recent update to the partition table is a null leader, as in #3486, but somehow that's showing up in a metadata response as "3" rather than -1.
TRACE 2022-01-16 07:27:32,061 [shard 1] cluster - partition_leaders_table.cc:113 - updated partition: {kafka/topic-ghcmgkqdld/0} leader: {term: 1, current leader: {nullopt}, previous leader: {2}}
TRACE 2022-01-16 07:27:32,892 [shard 0] kafka - request_context.h:146 - [172.18.0.51:47574] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker_n_38 port=9092 rack={nullopt}}, {node_id=2 host=docker_n_17 port=9092 rack={nullopt}}, {node_id=1 host=docker_n_16 port=9092 rack={nullopt}}} cluster_id={nullopt} controller_id=2 topics={} cluster_authorized_operations=-2147483648}
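For reference, the expected translation here is straightforward (a sketch, assuming only the Kafka protocol convention that a partition with no current leader is reported to clients as leader_id = -1, never as a concrete node ID):

```python
from typing import Optional

NO_LEADER = -1  # Kafka metadata convention for "no current leader"

def to_metadata_leader(current_leader: Optional[int]) -> int:
    # A null leader in the partition leaders table should surface to
    # clients as -1; any concrete node ID here would be a bug.
    return current_leader if current_leader is not None else NO_LEADER

print(to_metadata_leader(None))  # -> -1
print(to_metadata_leader(2))     # -> 2
```

So the trace above ({nullopt} in the leaders table) should have produced leader_id=-1 in the metadata response, which is why seeing 3 is surprising.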
This is probably fixed by the fix for #3486, but there's still a mystery (at least to me) of where that 3 is coming from, which might be another bug in its own right.
Removing ci-failure because I'm fairly sure the conditions that reproduced this are fixed with the fix for #3486, but keeping the ticket open for any comments on how we could have seen leader=3 under these conditions. @mmaslankaprv @dotnwat can you guys think of any mechanisms?
Seen again https://buildkite.com/vectorized/redpanda/builds/6240#4d23da5e-9696-4f1e-875b-df7298790740 along with cluster_config_test.ClusterConfigTest.test_restart
ref: https://buildkite.com/vectorized/redpanda/builds/6129#cd2aede7-c30f-4777-829b-beeb64247ca2
JIRA Link: CORE-815