RuckusWirelessIL / pentaho-kafka-consumer

Apache Kafka consumer step plug-in for Pentaho Kettle
Apache License 2.0

Controlling Offsets #13

daviesgj2 closed 8 years ago

daviesgj2 commented 8 years ago

Is it possible to control when an offset is written by PDI? That is, I would like to update the offset once the PDI transform has completed successfully, rather than when the message is read.

spektom commented 8 years ago

Hi,

No, for now it's not possible.


pentaho-aschurman commented 8 years ago

I'd like to propose an idea that could help solve this.

In general, the current architecture of PDI does not allow an input step (in this case, the Kafka Consumer) to check whether a row/message was completely processed before flagging it as successfully read. But there is an alternative...

The idea is a new type of Kafka Consumer step for PDI that, instead of being an input step, extends the Single Threaded Step or the Execute Transformation Step and calls a sub-transformation for each message. Based on the sub-transformation's result (success or failure), it can handle the message in different ways, one of which is flagging it as read or not.

This method would open up a new world of possibilities.
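
To make the control flow concrete, here is a minimal sketch in plain Java. It uses neither the Kettle nor the Kafka API: `Message`, `OffsetStore`, `processBatch`, and the `subTransformation` predicate are all hypothetical stand-ins, and a real step would have to map them onto the actual consumer and sub-transformation calls. It only illustrates the ordering: run the sub-transformation first, and record the offset only if it reports success.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CommitAfterSuccessSketch {

    /** Hypothetical stand-in for a Kafka message: an offset plus a payload. */
    static final class Message {
        final long offset;
        final String payload;

        Message(long offset, String payload) {
            this.offset = offset;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for wherever the consumer keeps its position. */
    static final class OffsetStore {
        private long lastProcessed = -1L; // -1 means nothing processed yet

        void markProcessed(long offset) { lastProcessed = offset; }

        long lastProcessed() { return lastProcessed; }
    }

    /**
     * Runs the sub-transformation for each message in order and records the
     * offset only after it reports success. Stops at the first failure, so the
     * failed message and everything after it stay unrecorded and can be re-read.
     */
    static long processBatch(List<Message> batch,
                             Predicate<Message> subTransformation,
                             OffsetStore offsets) {
        for (Message message : batch) {
            boolean succeeded;
            try {
                succeeded = subTransformation.test(message);
            } catch (RuntimeException e) {
                succeeded = false; // treat an exception like a failed result
            }
            if (!succeeded) {
                break;
            }
            offsets.markProcessed(message.offset);
        }
        return offsets.lastProcessed();
    }

    public static void main(String[] args) {
        List<Message> batch = Arrays.asList(
                new Message(0, "a"), new Message(1, "b"),
                new Message(2, "bad"), new Message(3, "c"));
        // Assumed sub-transformation: fails on the payload "bad".
        Predicate<Message> subTransformation = m -> !"bad".equals(m.payload);
        long last = processBatch(batch, subTransformation, new OffsetStore());
        System.out.println("last processed offset: " + last); // prints "last processed offset: 1"
    }
}
```

Stopping at the first failure is only one possible policy. Since the step sees each result, it could just as well skip the message or route it elsewhere. Note also that this sketch handles messages strictly one at a time.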

spektom commented 8 years ago

Wouldn't the Kafka consumer step become a bottleneck in that case?

dcohen24 commented 8 years ago

@pentaho-aschurman Could you explain your thinking a little more? Is there a way to see the result of that child transform? (I need similar functionality, and may be able to spend some time building it out.)