Open bplommer opened 3 years ago
The KafkaProducerConnection changes sound reasonable to me. In practice, in most cases I ended up with KafkaProducer[F, Option[String], Option[String]], which is in essence just an untyped producer.
Not sure about naming. SerializingKafkaProducer is a bit long. But I don't have a better alternative atm.
Further, there could be a variant that automatically derives keys from values (typically the key would be the same as some kind of id field from the value), so that calling code only needs to provide the value and not the key.
I'm not convinced that this would give an advantage to the users. Automatic derivation could silently derive something wrong for the key, and the compiler will not throw an error in case of an untyped producer. I think in that particular case we should prefer explicit passing of key and value.
I also suggest we add KafkaTopicProducer, which has both serializers and a topic name provided on instantiation and takes just key-value pairs rather than ProducerRecords as arguments for its produce method. This would simplify calling code in, I think, the great majority of use cases.
I think it could be useful for simple use cases. KafkaTopicProducer could be a simple wrapper over KafkaProducer. But don't forget that ProducerRecord is not just a key-value pair and topic. It could have partition, headers, timestamps. It should be a part of the API.
Not sure about naming. SerializingKafkaProducer is a bit long. But I don't have a better alternative atm.
Agreed. What about leaving KafkaProducer as it is and renaming KafkaProducerConnection to GenericKafkaProducer?
I'm not convinced that this would give an advantage to the users. Automatic derivation could silently derive something wrong for the key, and the compiler will not throw an error in case of an untyped producer. I think in that particular case we should prefer explicit passing of key and value.
I think I didn't explain my meaning properly. I don't mean the library should guess what the key should be, but rather that a function mkKey: V => K for deriving it would be provided when the producer is instantiated.
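A minimal sketch of that idea, assuming simplified stand-in types: Producer, ValueProducer, RecordingProducer and User are hypothetical names used only for illustration and are not part of the library. The key function is passed explicitly at construction, so nothing is derived implicitly.

```scala
object ValueProducerSketch {
  // Stand-in for a typed producer; F is the effect type.
  trait Producer[F[_], K, V] {
    def produce(topic: String, key: K, value: V): F[Unit]
  }

  // Hypothetical value-only producer: mkKey is supplied once, when the
  // producer is created, and applied to every value that is sent.
  final class ValueProducer[F[_], K, V](underlying: Producer[F, K, V], mkKey: V => K) {
    def produce(topic: String, value: V): F[Unit] =
      underlying.produce(topic, mkKey(value), value)
  }

  // Identity effect, used only so the sketch runs without an effect library.
  type Id[A] = A

  // Example value type whose id field serves as the key.
  final case class User(id: String, name: String)

  // In-memory producer that records what was sent.
  final class RecordingProducer[K, V] extends Producer[Id, K, V] {
    private val buffer = scala.collection.mutable.ListBuffer.empty[(String, K, V)]
    def produce(topic: String, key: K, value: V): Id[Unit] = {
      buffer += ((topic, key, value)); ()
    }
    def sent: List[(String, K, V)] = buffer.toList
  }
}
```

Calling code would then look like new ValueProducer[Id, String, User](underlying, _.id), after which only the value is passed to produce.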
I think it could be useful for simple use cases. KafkaTopicProducer could be a simple wrapper over KafkaProducer. But don't forget that ProducerRecord is not just a key-value pair and topic. It could have partition, headers, timestamps. It should be a part of the API.
Yes, good point.
What about leaving KafkaProducer as it is and renaming KafkaProducerConnection to GenericKafkaProducer?
Sounds good to me.
a function mkKey: V => K for deriving it would be provided when the producer is instantiated.
On first reading, I thought you were talking about a typeclass to derive a key from the value. A simple function sounds better in this context because it's explicit and simple.
On first reading, I thought you were talking about a typeclass to derive a key from the value. A simple function sounds better in this context because it's explicit and simple.
It could also allow a function K => ProducerRecord[F, K, V] to allow setting headers too. What do you think about SimpleKafkaProducer as a name?
I would avoid the Simple prefix because it's too generic. Maybe ValueKafkaProducer?
How about we rename KafkaProducerConnection to GenericKafkaProducer for 2.0, with a deprecated type alias as KafkaProducerConnection? Then the change should be mostly source-compatible, and it doesn't matter that it's binary breaking.
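As a rough illustration of the rename-plus-alias approach, the sketch below uses a placeholder trait body. The description member, the deprecation message and the "2.0.0" version string are assumptions made for the example.

```scala
object RenameSketch {
  // New name proposed in the thread. The body is a placeholder, not the real API.
  trait GenericKafkaProducer[F[_]] {
    def description: String
  }

  // Old name kept as a deprecated alias: existing source that mentions
  // KafkaProducerConnection[F] still compiles, but with a deprecation warning.
  @deprecated("Use GenericKafkaProducer instead", "2.0.0")
  type KafkaProducerConnection[F[_]] = GenericKafkaProducer[F]
}
```

Note that a type alias only covers type positions. If calling code also refers to a companion object under the old name, that would need its own deprecated forwarder, which is one reason the change is only mostly source-compatible.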
How about we rename KafkaProducerConnection to GenericKafkaProducer for 2.0, with a deprecated type alias as KafkaProducerConnection? Then the change should be mostly source-compatible, and it doesn't matter that it's binary breaking.
Ok
I’ve been thinking about how we can simplify the most common use cases.
KafkaProducerConnection really represents an untyped Kafka producer - it easily could, and possibly should, have a method produce[P, K: Serializer, V: Serializer](producerRecords: ProducerRecords[P, K, V]). KafkaProducer is really nothing more than a partial application of that general produce method.
I suggest that in 3.0 we rename KafkaProducerConnection to KafkaProducer and KafkaProducer to SerializingKafkaProducer.
I also suggest we add KafkaTopicProducer, which has both serializers and a topic name provided on instantiation and takes just key-value pairs rather than ProducerRecords as arguments for its produce method. This would simplify calling code in, I think, the great majority of use cases.
Further, there could be a variant that automatically derives keys from values (typically the key would be the same as some kind of id field from the value), so that calling code only needs to provide the value and not the key.
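The relationship between an untyped producer and a typed one can be sketched as follows. This is a simplified model under stated assumptions, not the library's API: Serializer, UntypedProducer, TypedProducer, produceBytes, typed and RecordingProducer are hypothetical stand-ins, and the record type is reduced to a topic, a key and a value.

```scala
import java.nio.charset.StandardCharsets

object PartialApplicationSketch {
  // Stand-in serializer typeclass.
  trait Serializer[A] { def serialize(a: A): Array[Byte] }
  object Serializer {
    implicit val string: Serializer[String] = new Serializer[String] {
      def serialize(a: String): Array[Byte] = a.getBytes(StandardCharsets.UTF_8)
    }
  }

  // Typed view: key and value types (and their serializers) are already fixed.
  trait TypedProducer[F[_], K, V] {
    def produce(topic: String, key: K, value: V): F[Unit]
  }

  // Untyped producer: it only knows how to send bytes, and serializers
  // are chosen per call.
  trait UntypedProducer[F[_]] {
    def produceBytes(topic: String, key: Array[Byte], value: Array[Byte]): F[Unit]

    def produce[K, V](topic: String, key: K, value: V)(
        implicit ks: Serializer[K], vs: Serializer[V]): F[Unit] =
      produceBytes(topic, ks.serialize(key), vs.serialize(value))

    // "Partial application": fix K, V and their serializers once,
    // and get back a typed producer that delegates to the general method.
    def typed[K, V](implicit ks: Serializer[K], vs: Serializer[V]): TypedProducer[F, K, V] =
      new TypedProducer[F, K, V] {
        def produce(topic: String, key: K, value: V): F[Unit] =
          UntypedProducer.this.produce[K, V](topic, key, value)(ks, vs)
      }
  }

  // Identity effect, used only so the sketch runs without an effect library.
  type Id[A] = A

  // In-memory producer that decodes the bytes back to strings for inspection.
  final class RecordingProducer extends UntypedProducer[Id] {
    private val buffer = scala.collection.mutable.ListBuffer.empty[(String, String, String)]
    def produceBytes(topic: String, key: Array[Byte], value: Array[Byte]): Id[Unit] = {
      buffer += ((topic, new String(key, StandardCharsets.UTF_8), new String(value, StandardCharsets.UTF_8)))
      ()
    }
    def sent: List[(String, String, String)] = buffer.toList
  }
}
```

In this model, the typed producer holds no state of its own. It is just the general produce method with the type parameters and serializers supplied up front.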
The main drawback I see is that the resulting number of variants could cause confusion, but I think careful API design would mitigate that.
Any thoughts? @vlovgr @LMnet