fede1024 / rust-rdkafka

A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka
MIT License

How to determine severity of underlying librdkafka error? #718

Open quasi-coherent opened 2 months ago

quasi-coherent commented 2 months ago

I have basically this same question. It would be extremely useful to get a KafkaError and know if the client needs reinitialization. There are also the is_retriable and txn_requires_abort methods on RDKafkaError that would be useful in constructing certain application logic.

This PR was merged, which added is_fatal to KafkaError. After that the commit history gets confusing: this PR changes the underlying type you can get from the lower-level error API, and then this PR, with this commit, removes it. The only explanation given is that it was added unintentionally, with a link to the closed issue I mention at the top, where the same contributor seems to have added it quite deliberately. As things stand, I don't see how to get that information.

Was removing that a mistake? Is there another way to discern the severity of a given error?

quasi-coherent commented 2 months ago

I can't glean enough from the documentation to answer this confidently. Were all these changes made to encourage use of RDKafkaErrorCode? Does that carry the same meaning? In other words, does

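use rdkafka::error::{KafkaError, RDKafkaErrorCode};

// Application-level error type, shown so the snippet is complete.
enum MyConsumerAppError {
    Fatal(KafkaError),
    Retriable(KafkaError),
    GenericKafka(KafkaError),
}

impl From<KafkaError> for MyConsumerAppError {
    fn from(v: KafkaError) -> Self {
        // rdkafka_error_code() is None for errors that carry no librdkafka code.
        match v.rdkafka_error_code() {
            Some(RDKafkaErrorCode::Fatal) => MyConsumerAppError::Fatal(v),
            Some(RDKafkaErrorCode::Retry) => MyConsumerAppError::Retriable(v),
            _ => MyConsumerAppError::GenericKafka(v),
        }
    }
}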

achieve the same thing as if is_fatal and is_retriable were exposed?
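One way to see why the two might differ: as far as I understand librdkafka, the fatal, retriable and abort-required properties are flags on the error object, queried separately from its error code. Below is a minimal, dependency-free sketch of that distinction. `StubError`, `Code` and `Action` are hypothetical stand-ins invented for illustration, not rust-rdkafka types.

```rust
// Hypothetical stand-ins; none of these types come from rust-rdkafka.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Code { Fatal, Retry, TimedOut }

// Models an error that carries severity flags next to its code.
#[derive(Debug, Clone, Copy)]
struct StubError { code: Code, fatal: bool, retriable: bool }

#[derive(Debug, PartialEq)]
enum Action { Reinitialize, Retry, Surface }

// Decides using only the code, as in the `From` impl above.
fn by_code(e: StubError) -> Action {
    match e.code {
        Code::Fatal => Action::Reinitialize,
        Code::Retry => Action::Retry,
        _ => Action::Surface,
    }
}

// Decides using the flags; fatal takes precedence over retriable.
fn by_flags(e: StubError) -> Action {
    if e.fatal {
        Action::Reinitialize
    } else if e.retriable {
        Action::Retry
    } else {
        Action::Surface
    }
}
```

For an error such as `StubError { code: Code::TimedOut, fatal: false, retriable: true }`, `by_code` yields `Action::Surface` while `by_flags` yields `Action::Retry`. The two approaches agree only if every retriable error also carries the retry code, which this sketch assumes is not guaranteed.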

quasi-coherent commented 1 day ago

Does that carry the same meaning?

As it turns out, the answer is either "no" or "yes, but 'fatal' doesn't mean what you think it does."

For instance, the consumer error UnknownTopicOrPartition is not considered fatal (in that it doesn't carry the FATAL error code, which is -150 in librdkafka), even though trying to subscribe to a topic that doesn't exist strikes me as being pretty fatal to the progress of a consumer.

Edit: Sorry, I re-opened this because the original question isn't quite resolved. And this third comment raises the question of how to better determine the severity of an error, if not by inspecting the error's RDKafkaErrorCode variant.
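One possible direction, offered as an assumption rather than anything rust-rdkafka provides: define severity in application terms and let the application decide which codes it cannot recover from. In the sketch below, `ErrCode`, `Severity` and `severity_for_consumer` are hypothetical names; in real code the match would presumably be over `RDKafkaErrorCode`.

```rust
// Hypothetical stand-in for a handful of librdkafka error codes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrCode { Fatal, Retry, UnknownTopicOrPartition, Other }

#[derive(Debug, PartialEq)]
enum Severity { Fatal, Retriable, Generic }

// Application policy for a consumer: a missing topic stops progress,
// so it is treated as fatal even though its code is not `Fatal`.
fn severity_for_consumer(code: Option<ErrCode>) -> Severity {
    match code {
        Some(ErrCode::Fatal) | Some(ErrCode::UnknownTopicOrPartition) => Severity::Fatal,
        Some(ErrCode::Retry) => Severity::Retriable,
        // No code at all, or a code the policy does not single out.
        _ => Severity::Generic,
    }
}
```

The point of this shape is that "fatal to the client instance" and "fatal to this application's progress" become separate decisions, and the second lives in one function the application owns.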