ntent / kafka4net

C# client for Kafka
Apache License 2.0

No exceptions thrown when all brokers are not available during startup of producer #19

Open torokoh opened 9 years ago

torokoh commented 9 years ago

Hi. I'm trying to determine how the Producer behaves when it cannot connect to any broker in the Kafka cluster during startup. Based on my testing, no exceptions are thrown when the brokers are unavailable. Am I doing anything wrong?

This is how I tested:

1) Shut down all Kafka broker processes
2) Run a simple producer and call await ConnectAsync()
3) Send a message

Some context on why I need to do this: my application can be disconnected from the Kafka cluster at any time. To prevent data loss, the application will cache the data on the local computer and send it to the cluster upon reconnection.

vchekan commented 9 years ago

The idea is to try as hard as possible to deliver data. Even if no broker is available, that does not mean the condition is permanent. It could be maintenance downtime, for example. The driver cannot know how critical it is for the application to deliver data within a timeout period.

I can think of scenarios where data becomes stale after a certain timeout, or where an alert should be sent.

You can get some visibility into errors by subscribing to Producer.OnTempError and OnSuccess, and keeping track of how many messages are in an error condition and for how long. But perhaps the driver could expose some API to make this tracking easier for the app.

What is your use case?
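
A minimal sketch of the application-side tracking described above, in C#. It is not part of kafka4net: the PendingTracker class and all of its members are hypothetical, and the signatures of Producer.OnSuccess and Producer.OnTempError are not shown in this thread, so wiring the tracker into those handlers is left to the caller. The clock is injectable only to make the class easy to test.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

// Hypothetical helper (not a kafka4net type): remembers when each message was
// handed to the producer so the app can tell how many messages are still
// unconfirmed and for how long.
public sealed class PendingTracker
{
    private readonly ConcurrentDictionary<string, DateTime> _pending =
        new ConcurrentDictionary<string, DateTime>();
    private readonly Func<DateTime> _clock;
    private int _tempErrors;

    public PendingTracker(Func<DateTime> clock = null)
    {
        _clock = clock ?? (() => DateTime.UtcNow);
    }

    // Call right before handing a message to the producer.
    // TryAdd keeps the first timestamp if the same id is recorded again.
    public void MarkSent(string messageId) => _pending.TryAdd(messageId, _clock());

    // Call from the success handler for each confirmed message.
    public void MarkDelivered(string messageId) => _pending.TryRemove(messageId, out _);

    // Call from the temporary-error handler, if it is ever raised.
    public void RecordTempError() => Interlocked.Increment(ref _tempErrors);

    public int PendingCount => _pending.Count;
    public int TempErrorCount => Volatile.Read(ref _tempErrors);

    // Age of the oldest unconfirmed message, or zero if nothing is pending.
    public TimeSpan OldestPendingAge()
    {
        var now = _clock();
        var times = _pending.Values.ToArray();
        return times.Length == 0 ? TimeSpan.Zero : now - times.Min();
    }

    // True when some message has been waiting longer than the app tolerates.
    public bool IsStale(TimeSpan limit) => OldestPendingAge() > limit;
}
```

The application could poll IsStale on a timer and raise an alert or fall back to local storage when it returns true. The choice of limit is the application's, which matches the point that the driver cannot know how urgent delivery is.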

vikmv commented 9 years ago

By the way, I tried to find usages of Producer.OnTempError, and the only one was this: OnTempError = null; (in Producer.CloseAsync). That is probably not how it is supposed to be.

vchekan commented 9 years ago

@vikmv oh, you are right. It fell through the cracks somewhere. Marking as a bug.

torokoh commented 9 years ago

@vchekan I'm installing an application across many machines to collect some statistics. These machines are connected to the Kafka cluster through an internal network. They are mobile and can be disconnected from the Kafka cluster for a prolonged period of time, and I want to minimize data loss. That is why the data needs to be cached on the client machine.
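
One way to approach the client-side caching described above is a small store-and-forward spool on disk. The sketch below is an assumption, not something kafka4net provides: FileSpool, its members, and the trySend delegate are hypothetical names, and trySend stands in for whatever the application uses to send a record and learn whether it was confirmed. It assumes a single flusher at a time.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical store-and-forward spool (not a kafka4net type). Records are
// appended to a file as Base64 lines and removed only after a confirmed send.
public sealed class FileSpool
{
    private readonly string _path;
    private readonly object _gate = new object();

    public FileSpool(string path) { _path = path; }

    // Persist one record; survives process restarts and long disconnections.
    public void Enqueue(byte[] payload)
    {
        lock (_gate)
        {
            File.AppendAllLines(_path, new[] { Convert.ToBase64String(payload) });
        }
    }

    // Number of records currently waiting on disk.
    public int Count
    {
        get
        {
            lock (_gate)
            {
                return File.Exists(_path) ? File.ReadLines(_path).Count() : 0;
            }
        }
    }

    // Sends records oldest first and stops at the first failure.
    // Returns how many records were sent and removed from the spool.
    public async Task<int> FlushAsync(Func<byte[], Task<bool>> trySend)
    {
        string[] lines;
        lock (_gate)
        {
            lines = File.Exists(_path) ? File.ReadAllLines(_path) : new string[0];
        }

        int sent = 0;
        foreach (var line in lines)
        {
            if (!await trySend(Convert.FromBase64String(line))) break;
            sent++;
        }
        if (sent == 0) return 0;

        lock (_gate)
        {
            // Re-read so that records enqueued during the flush are kept.
            var current = File.ReadAllLines(_path);
            File.WriteAllLines(_path, current.Skip(sent));
        }
        return sent;
    }
}
```

The application would call Enqueue for every record and call FlushAsync periodically or when connectivity returns. If the process crashes after a send but before the file is rewritten, those records are sent again on the next flush, so this gives at-least-once delivery and consumers may see duplicates.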