From gitter:
Q: When I have this in the producer configuration
props.put(TRANSACTIONAL_ID_CONFIG, "my-service");
why do I still need to set a transactionIdPrefix on the producer factory?
DefaultKafkaProducerFactory<String, byte[]> producerFactory = new DefaultKafkaProducerFactory<>(producerConfig());
// otherwise transactions don't work
producerFactory.setTransactionIdPrefix("my-service");
Suggestion: ProducerFactory could source the prefix property from the ProducerConfig if not specified directly. The fact that one is a prefix and the other not is not so relevant from an end user perspective.
Gary Russell @garyrussell Mar 15 16:45
@jorgheymans Because we have to manage a pool of producers when using transactions, for concurrency; producers have to have unique transaction ids; hence it's a prefix.
Furthermore, when publishing on a consumer thread, the transaction id has to include the group/topic/partition so that zombie fencing works. If you don't need zombie fencing (it is required for exactly-once semantics), you can set producerPerConsumerPartition to false, but we still need multiple txids for concurrency.
We could add some code to detect concurrent use of a single producer to allow that configuration, but what is the objection to using the factory property instead of the producer config?
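To illustrate why the factory works with a prefix rather than a single id, here is a minimal sketch of the two id schemes described above. TxIdSketch and its methods are hypothetical names and not part of spring-kafka; the exact layout of the group/topic/partition suffix is an assumption made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: TxIdSketch is a hypothetical class, not spring-kafka API.
class TxIdSketch {

    private final String prefix;
    private final AtomicInteger suffix = new AtomicInteger();

    TxIdSketch(String prefix) {
        this.prefix = prefix;
    }

    // Pooled producers: each new producer gets the prefix plus a counter,
    // so concurrent transactions never share a transactional.id.
    String nextPooledId() {
        return prefix + suffix.getAndIncrement();
    }

    // Publishing on a consumer thread: the id is derived from group/topic/partition,
    // so a restarted instance reuses the same id and fences the old (zombie) producer.
    // The "group.topic.partition" layout is an assumption.
    String idForPartition(String group, String topic, int partition) {
        return prefix + group + "." + topic + "." + partition;
    }

    public static void main(String[] args) {
        TxIdSketch ids = new TxIdSketch("my-service-");
        System.out.println(ids.nextPooledId());
        System.out.println(ids.nextPooledId());
        System.out.println(ids.idForPartition("orders-group", "orders", 3));
    }
}
```

With a single fixed transactional.id, the second pooled producer would fence the first one, which is why one id per producer is needed.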
Jorg Heymans @jorgheymans Mar 16 08:55
@garyrussell No objection, it's just confusing to me that both need to be set. From a user perspective they look like the same thing. Why not just take TRANSACTIONAL_ID_CONFIG as the default prefix, for example?
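The suggested fallback could look roughly like the sketch below. It is not how the factory currently behaves; PrefixFallbackSketch and resolvePrefix are hypothetical names, and the config is modelled as a plain Map so the example runs without Kafka on the classpath. The only fact taken from Kafka is that ProducerConfig.TRANSACTIONAL_ID_CONFIG has the string value "transactional.id".

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hypothetical helper, not spring-kafka API.
class PrefixFallbackSketch {

    // String value of ProducerConfig.TRANSACTIONAL_ID_CONFIG in the Kafka client.
    static final String TRANSACTIONAL_ID_CONFIG = "transactional.id";

    // Prefer the prefix set on the factory; otherwise fall back to the
    // transactional.id from the producer config; return null if neither is set.
    static String resolvePrefix(String explicitPrefix, Map<String, Object> producerConfig) {
        if (explicitPrefix != null) {
            return explicitPrefix;
        }
        Object configured = producerConfig.get(TRANSACTIONAL_ID_CONFIG);
        return configured != null ? configured.toString() : null;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put(TRANSACTIONAL_ID_CONFIG, "my-service");
        System.out.println(resolvePrefix(null, props));          // falls back to the config value
        System.out.println(resolvePrefix("other-prefix", props)); // explicit prefix wins
    }
}
```

The factory would still append its own suffix to the resolved value, so the configured transactional.id would act as a prefix and never be used verbatim.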