Open c-thiel opened 4 weeks ago
Turns out there is something to discuss right from the get-go.
I kind of expected that there would be one clear choice for a kafka rust lib, but at least at first glance, there is not.
There seem to be two more or less mature implementations available:
Seems to be a pure Rust implementation. A first look at the examples shows that it seems to be pretty easy to use. There are a few things to mention, though
... is actually "just" a safe interface to librdkafka
I am not sure what the best choice is here. If introducing a C lib is not an option, kafka-rust seems to be the only choice. If it is ok to schlep around a C lib, rust-rdkafka is also async and seems to be "endorsed" by Cloud Events SDK.
Or maybe I am overlooking "that other kafka rust lib" that has fewer downsides than kafka-rust or rust-rdkafka :smile:
I went looking and found this, it's rather new but looks promising?
https://www.reddit.com/r/rust/comments/1ehpjgh/rust_native_kafka_protocol_and_client/
https://github.com/CallistoLabsNYC/samsa
In terms of maturity & user-base it probably makes sense to stick to rdkafka for now. Eventually we should switch over to a rust-native implementation to get rid of the C dependency.
@twuebi samsa indeed looks promising, but from your second comment I gather: rdkafka it is, for now.
Three questions:
rdkafka, or add the dependency ourselves?
librdkafka or depend on an existing version? I'd probably vote
nats is used

I'd say let's give rdkafka a try then. We should probably depend on cloudevents sdk's packaged rdkafka. From a cursory read, it seems that serialization of cloudevents to kafka is a bit more involved than what we do for nats: compare cloudevents-sdk-0.7.0/src/binding/nats/serializer.rs:19 with cloudevents-sdk-0.7.0/src/binding/rdkafka/kafka_producer_record.rs:24.
We've gone for depending on async-nats directly since cloudevents didn't package async-nats IIRC.
Existing publishers can be found in crates/iceberg-catalog/src/service/event_publisher.rs:166..
@c-thiel @twuebi
I just realized that the latest release of cloudevents sdk depends on rdkafka ^0.29. Current release is 0.36.2.
0.29 is almost 2 years old. It depends on librdkafka 1.9, which is also almost 2 years old. Current version of librdkafka is 2.5.
The main branch of cloudevents sdk is already on ^0.36
Tbh, I am not sure what would be a good way to solve this 🙈
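The mismatch follows from how Cargo treats caret requirements on 0.x versions: `^0.29` only admits 0.29.z releases, so 0.36.2 can never satisfy it. Below is a minimal sketch of that rule, assuming plain `major.minor.patch` versions. The names `Version`, `parse`, and `caret_matches` are made up for illustration, and this is not Cargo's actual resolver code (the `semver` crate covers the full rules, including partial requirements such as `^0`).

```rust
/// A bare `major.minor.patch` version. Pre-release and build metadata
/// are deliberately ignored in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parses "0.36.2" or a partial "0.29" (missing parts default to 0).
/// Returns `None` for anything that is not one to three dot-separated numbers.
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.trim().split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = match parts.next() {
        Some(p) => p.ok()?,
        None => 0,
    };
    let patch = match parts.next() {
        Some(p) => p.ok()?,
        None => 0,
    };
    if parts.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// Simplified caret rule: the candidate must be at least `req`, and the
/// leftmost non-zero component of `req` must not change.
/// Simplification: partial requirements like `^0` or `^0.0` are treated as
/// `^0.0.0` here, which is stricter than what Cargo does.
fn caret_matches(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false;
    }
    if req.major > 0 {
        candidate.major == req.major
    } else if req.minor > 0 {
        candidate.major == 0 && candidate.minor == req.minor
    } else {
        candidate.major == 0 && candidate.minor == 0 && candidate.patch == req.patch
    }
}
```

With these definitions, checking the requirement `0.29` against `0.36.2` returns `false`, while `0.29.1` returns `true`. That is why depending on the released cloudevents sdk pins us to the old rdkafka line.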
Hm, unfortunate. I'd say either ask CloudEvents-sdk for a release or vendor their serialization code for rdkafka.
If vendoring is an option, I will do that. That way I can continue (well, start...) working, and it should also make things easier if or when cloud events sdk releases a new version.
Regarding asking cloud events sdk for a release: maybe something you could or should do @twuebi? I'd maybe feel a bit uncomfortable since this is not my codebase 😅
then let's start with vendoring
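For a rough idea of what the vendored piece has to do: in the binary content mode of the CloudEvents Kafka protocol binding, event attributes travel as record headers prefixed with `ce_`, the data content type goes into a `content-type` header, and the event data becomes the record value. The sketch below shows just that mapping with hypothetical, pared-down types (`SketchEvent`, `SketchRecord`, `to_binary_mode`). It is not the cloudevents sdk's code and it leaves out optional attributes, extensions, and the actual rdkafka record type.

```rust
/// Hypothetical, pared-down event for illustration only. The real
/// CloudEvents event type carries more attributes plus extensions.
struct SketchEvent {
    id: String,
    source: String,
    ty: String,
    datacontenttype: Option<String>,
    data: Option<Vec<u8>>,
}

/// The two parts a Kafka producer record needs from the event:
/// a header list and an optional payload.
struct SketchRecord {
    headers: Vec<(String, Vec<u8>)>,
    payload: Option<Vec<u8>>,
}

/// Maps an event to binary content mode: required attributes become
/// `ce_`-prefixed headers, the content type becomes `content-type`,
/// and the data is passed through as the record value.
fn to_binary_mode(event: SketchEvent) -> SketchRecord {
    let mut headers = vec![
        // Assumption: this sketch only emits spec version 1.0 events.
        ("ce_specversion".to_string(), b"1.0".to_vec()),
        ("ce_id".to_string(), event.id.into_bytes()),
        ("ce_source".to_string(), event.source.into_bytes()),
        ("ce_type".to_string(), event.ty.into_bytes()),
    ];
    if let Some(content_type) = event.datacontenttype {
        headers.push(("content-type".to_string(), content_type.into_bytes()));
    }
    SketchRecord {
        headers,
        payload: event.data,
    }
}
```

Keeping the mapping independent of the rdkafka types like this would also make it easier to drop the vendored code again once cloud events sdk publishes a release on a current rdkafka.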
I would like to give this one a shot