oleksiyk / kafka

Apache Kafka 0.9 client for Node
MIT License

LeaderNotAvailable disguised as UnknownTopicOrPartition #225

Closed zvictor closed 6 years ago

zvictor commented 6 years ago

the problem

When SimpleConsumer.subscribe is called in a LeaderNotAvailable scenario, an UnknownTopicOrPartition error is thrown instead:

KafkaError: This request is for a topic or partition that does not exist on this broker.

This can be reproduced intermittently (sometimes the error appears, sometimes it doesn't) by running the code at https://github.com/Quadric/radiaction/tree/40d3433be9da803ab2c2207e51f4088bcb4ed069/examples/basic-example

It's important to do something about this, because such a case is very hard to catch and debug. It took me days to find this error hidden inside SimpleConsumer.client.topicMetadata. Keep in mind that the error is never guaranteed to be there the next time you run your code, given the nature of a LeaderNotAvailable issue. This is how my topicMetadata looks sometimes (other times it's just empty):

{
  "rick-morty__BUY_SAUCE": {
    "0": {
      "error": {
        "name": "KafkaError",
        "code": "LeaderNotAvailable",
        "message": "This error is thrown if we are in the middle of a leadership election and there is currently no leader for this partition and hence it is unavailable for writes."
      },
      "partitionId": 0,
      "leader": -1,
      "replicas": [],
      "isr": []
    }
  },
  ... // repeats for every topic
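
For anyone debugging the same situation, a small helper along these lines can surface per-partition errors buried in a metadata object of the shape shown above. This is only a sketch: `findPartitionErrors` and `sample` are hypothetical names, not part of the library, and it assumes the metadata always follows the topic -> partition -> `{ error, ... }` layout from the dump.

```javascript
// Hypothetical helper (not part of the library): walks a metadata object
// shaped like { [topic]: { [partition]: { error?, leader, ... } } }
// and collects every partition entry that carries an `error` field.
function findPartitionErrors(topicMetadata) {
  const problems = [];
  for (const [topic, partitions] of Object.entries(topicMetadata || {})) {
    for (const [partition, meta] of Object.entries(partitions || {})) {
      if (meta && meta.error) {
        problems.push({
          topic,
          partition: Number(partition),
          code: meta.error.code,
        });
      }
    }
  }
  return problems;
}

// Sample input copied from the structure of the dump above (message omitted).
const sample = {
  'rick-morty__BUY_SAUCE': {
    0: {
      error: { name: 'KafkaError', code: 'LeaderNotAvailable' },
      partitionId: 0,
      leader: -1,
      replicas: [],
      isr: [],
    },
  },
};

console.log(findPartitionErrors(sample));
```

An empty metadata object yields an empty list, which matches the "sometimes it's just empty" case and so cannot be told apart from a healthy cluster by this check alone.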

the solutions

oleksiyk commented 6 years ago

Topic:partition pairs that received a LeaderNotAvailable error during subscribe will be retried on each _fetch call. So that's exactly what you describe as the second solution:

there needs to be a way to wait for a leader to be elected, and then be able to call subscribe again.
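
For callers who prefer an explicit wait at the application level rather than relying on the internal retry, the idea could be sketched roughly as below. Everything here is an assumption for illustration: `subscribeWithRetry`, `RETRYABLE`, `retries`, and `delayMs` are hypothetical names, and the only thing taken from this thread is that the error object carries a `code` such as `LeaderNotAvailable` or `UnknownTopicOrPartition`.

```javascript
// Hypothetical wrapper (not part of the library): calls `subscribeFn` and
// retries after a delay while it rejects with a leader-related error code.
const RETRYABLE = new Set(['LeaderNotAvailable', 'UnknownTopicOrPartition']);

async function subscribeWithRetry(subscribeFn, { retries = 5, delayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await subscribeFn();
    } catch (err) {
      const retryable = Boolean(err) && RETRYABLE.has(err.code);
      // Give up on unrelated errors or once the retry budget is spent.
      if (!retryable || attempt >= retries) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage sketch, assuming `consumer.subscribe(...)` returns a promise:
// subscribeWithRetry(() => consumer.subscribe(topic, partitions, handler));
```

A fixed delay keeps the sketch short; leadership elections can take a variable amount of time, so a backoff schedule may be a better fit in practice.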