fluent / fluent-plugin-kafka

Kafka input and output plugin for Fluentd

ruby-kafka is no longer maintained and superseded by librdkafka via rdkafka-ruby bindings #500

Open mensfeld opened 9 months ago

mensfeld commented 9 months ago

Describe the bug

I just wanted to bring to your attention the official statement from the ruby-kafka repo:

This library is no longer actively developed and has been superseded by librdkafka via rdkafka-ruby bindings. While this library may still receive security patches and bug fixes, it is no longer recommended for production usage.

ruby-kafka has some critical errors that can lead to data loss. Even Zendesk has already switched to rdkafka for its other operations.

While I cannot replace it fully (rdkafka and waterdrop maintainer here) in this plugin because of my limited knowledge, I am happy to help you guys out :)

The downside is jruby - which is something that will have to be figured out.

To Reproduce

To reproduce or rather see, check the .gemspec
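For readers who would rather check this programmatically than read the .gemspec by hand, here is a minimal sketch using the standard RubyGems API. The helper name `flagged_dependencies` and the example gemspec are hypothetical and exist only for illustration; the plugin's real dependency list and version constraints are not reproduced here.

```ruby
require "rubygems"

# Hypothetical helper: returns the names of runtime dependencies declared in
# +spec+ that also appear in the +flagged+ list.
def flagged_dependencies(spec, flagged = ["ruby-kafka"])
  spec.runtime_dependencies.map(&:name) & flagged
end

# Hypothetical gemspec used only for illustration. It is NOT the plugin's
# actual gemspec; the gem names other than ruby-kafka are made up.
example_spec = Gem::Specification.new do |s|
  s.name    = "example-plugin"
  s.version = "0.0.1"
  s.summary = "illustration only"
  s.authors = ["example"]
  s.add_runtime_dependency "ruby-kafka"
  s.add_runtime_dependency "some-other-gem"
end

puts flagged_dependencies(example_spec).inspect
```

If the plugin is installed locally, the same helper could presumably be pointed at `Gem::Specification.find_by_name("fluent-plugin-kafka")` instead of the example spec.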

Expected behavior

I would expect this library to use one of the maintained and recommended Kafka gems.

Your Environment

Applies to any environment.

Your Configuration

Applies to any config.

Your Error Log

Irrelevant.

Additional context

Please close it if you feel this is not a bug and it should be raised as a discussion or something else. From my perspective it is a bug, as it poses a risk of data loss for end users.

ashie commented 9 months ago

Yes, that's been a concern for us recently. We already have an rdkafka-based plugin: https://github.com/fluent/fluent-plugin-kafka/blob/master/lib/fluent/plugin/out_rdkafka2.rb so one option is to drop the out_kafka2 plugin and maintain only out_rdkafka2.

As for waterdrop, I've only just learned about it. It might be a good replacement for ruby-kafka. Thanks for letting us know! We'll consider it.

mensfeld commented 9 months ago

Your implementation will certainly cause Ruby VM crashes in one place: you cannot destroy the rdkafka object on timeout during close if there are ANY outgoing messages. Both the outgoing messages and the events queue need to be purged first: https://github.com/karafka/waterdrop/blob/master/lib/waterdrop/producer.rb#L186

Otherwise, once in a while there will be dangling data, which librdkafka does not tolerate and which causes VM failures.
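To make the ordering concrete, here is a minimal sketch of a close sequence that never destroys the client while messages are still pending. It is an illustration under stated assumptions, not the plugin's or waterdrop's actual code: `FakeClient`, `FlushTimeout`, and `safe_close` are hypothetical names, and the stub only assumes the real producer handle exposes `flush`, `purge`, and `close` operations with roughly these roles.

```ruby
# Hypothetical error raised by the stub when a flush does not finish in time.
FlushTimeout = Class.new(StandardError)

# Hypothetical stand-in for a producer handle. It records the order of calls
# and refuses to close while messages are still pending, which models the
# condition that would crash the VM with a real native client.
class FakeClient
  attr_reader :calls

  def initialize(pending:)
    @pending = pending
    @calls = []
  end

  # Pretend flush: it only succeeds when nothing is pending.
  def flush(_timeout_ms)
    @calls << :flush
    raise FlushTimeout, "flush timed out" if @pending.positive?
  end

  # Drops everything still queued (outgoing messages and events alike).
  def purge
    @calls << :purge
    @pending = 0
  end

  def close
    raise "closed with dangling messages" if @pending.positive?
    @calls << :close
  end
end

# Try a graceful flush first; on timeout, purge before closing so that the
# client is never destroyed with data still in flight.
def safe_close(client, timeout_ms: 5_000)
  begin
    client.flush(timeout_ms)
  rescue FlushTimeout
    client.purge
  end
  client.close
end

client = FakeClient.new(pending: 3)
safe_close(client, timeout_ms: 10)
p client.calls # prints [:flush, :purge, :close]
```

The trade-off in this sketch is that a purge discards undelivered messages, so a real implementation would likely want to log or report what was dropped.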

Anyhow keep me posted and happy to help :)

raytung commented 8 months ago

Hey @mensfeld, to clarify, when you mentioned "The downside is jruby - which is something that will have to be figured out.", do you mean waterdrop only supports jruby or that it does not support jruby? If it's the latter, I believe Fluentd currently doesn't support jruby (https://github.com/fluent/fluentd/issues/4098) so it should be alright.

@ashie - I might quickly put together a PoC for waterdrop

mensfeld commented 8 months ago

that it does not support jruby

That. It does NOT support jruby as it uses ffi and C bindings.

mensfeld commented 8 months ago

@ashie feel free to ping me if you need any help :) Actually, I misread: @raytung, you ping me if you need any help ;)