swift-server / swift-kafka-client

Apache License 2.0

Functionality wishlist #75

Open mr-swifter opened 1 year ago

mr-swifter commented 1 year ago

Hello!

As I mentioned before, I am really eager to move off my homemade implementation of a Swift Kafka API and use this package instead, and I am ready to contribute to make that happen. To keep this transparent, I would like to list the functional gaps and create a separate issue for every item. Here is the list:

felixschlegel commented 1 year ago

Hey @mr-swifter,

Thanks for this extensive list of items. It aligns very much with what we also imagine the future of this package to be! Here are some things that came to my mind when reading your list:

  • Rebalance callback: redefine the rebalance callback so that it can react properly to partition assignment and unassignment under all assignment strategies, and make it possible to seek to specific offsets.
  • Statistics callback: listen to Kafka statistics.
  • Logging callback (https://github.com/swift-server/swift-kafka-gsoc/pull/61): redirect librdkafka logs to a logger.

We can add #62 to that list. Given that we'd end up with a lot of callbacks, we should think about setting up a general events callback in the future. I think confluent-kafka-go did it nicely.
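A minimal sketch of what such a general events callback could look like in Swift. Every name here (`KafkaEvent`, its cases, `describe`) is hypothetical and chosen for illustration only; none of it is taken from swift-kafka-client or confluent-kafka-go. The idea is that one enum plus one handler stands in for separate rebalance, statistics and logging callbacks.

```swift
// Hypothetical event type: these names are assumptions, not the package's API.
enum KafkaEvent: Equatable {
    case rebalance(assigned: [Int32], revoked: [Int32])
    case statistics(json: String)
    case log(level: Int, message: String)
}

// A single handler switches over the event kind instead of
// registering one callback per kind.
func describe(_ event: KafkaEvent) -> String {
    switch event {
    case .rebalance(let assigned, let revoked):
        return "rebalance: +\(assigned) -\(revoked)"
    case .statistics(let json):
        return "statistics: \(json.utf8.count) bytes"
    case .log(let level, let message):
        return "log[\(level)]: \(message)"
    }
}
```

Because the `switch` is exhaustive, adding a new event kind later would surface every handler that still needs updating at compile time.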

  • Admin API: create and remove topics, manage consumer groups, etc.

We already have some admin methods, but they are only used for testing atm. It would be great to revisit them and come up with a public API for that!

  • Poll outside of the cooperative queue: always poll from a single task, with backpressure from that task (sync domain) to the consumer stream (async domain).

We have two open PRs atm, #66 and #67, that make KafkaConsumer and KafkaProducer expose a method func run() async that runs the poll loop. In the future we also want to conform to the swift-service-lifecycle protocols. Regarding backpressure: afaik the only way to achieve backpressure with librdkafka is to set configuration options like queued.max.messages.kbytes in the KafkaConsumer. Exerting backpressure by not calling poll() does not work, because poll() should be called at regular intervals to serve any queued librdkafka events, and afaik librdkafka enqueues incoming events and messages regardless of whether poll() is being called.
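The two points above can be sketched roughly as follows, under stated assumptions. The property names `queued.min.messages` and `queued.max.messages.kbytes` are librdkafka consumer settings; the values are purely illustrative. `runPollLoop` and its `poll` closure are hypothetical stand-ins and do not reflect the actual `run()` implementation in #66 or #67.

```swift
// librdkafka consumer properties that bound the local pre-fetch queue.
// The values are illustrative assumptions, not recommendations.
let consumerQueueLimits: [String: String] = [
    "queued.min.messages": "1000",
    "queued.max.messages.kbytes": "16384",
]

// Hypothetical poll loop: `poll` stands in for whatever serves queued
// librdkafka events. The loop keeps polling at a fixed interval no matter
// how fast the application consumes messages, and ends when the
// surrounding task is cancelled (Task.sleep then throws CancellationError).
func runPollLoop(
    intervalNanoseconds: UInt64,
    poll: @Sendable () -> Void
) async throws {
    while !Task.isCancelled {
        poll()
        try await Task.sleep(nanoseconds: intervalNanoseconds)
    }
}
```

In this shape the queue limits, not the polling cadence, are what cap how much data is buffered ahead of the application.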

  • Statically link with dependencies: get rid of runtime dependencies so that a single binary can be deployed without prerequisites.

I already had a PR for linking librdkafka statically, #49, though there are some legal question marks about whether we are allowed to ship binaries as part of our package.


At the moment I am working on getting a v1.0.0-alpha release ready and doing a thorough API review so that we finally have a stable public interface 😄

Best, Felix

mr-swifter commented 1 year ago

Do you mind if I create separate issues for the items in the list, so that we can start detailed discussions on each one?

felixschlegel commented 1 year ago

Yes sure!