harlow / kinesis-consumer

Golang library for consuming Kinesis stream data
MIT License
263 stars 88 forks source link

How to obtain ShardID from within ScanFunc? #120

Closed jtackaberry closed 4 years ago

jtackaberry commented 4 years ago

I need to implement custom checkpointing by manually calling SetCheckpoint() after some asynchronous processing of the consumed Kinesis events has completed.

The problem is that the ScanFunc only receives kinesis.Record which has the partition key, but not the shard id, so I'm not able to invoke SetCheckpoint().

Is there a way for the ScanFunc callback to somehow discover the corresponding shard id of the record it was passed?

Thanks!

harlow commented 4 years ago

Hi @jtackaberry unfortunately no, there currently isn't a way to get the shardid from the record. the layer of abstraction wasn't designed to support this.

trying to get my head around whether doing something like this could make sense. or if there is an easier way to weave the ShardID through to your callback func


type Record struct {
  kinesis.Record
  ShardID
}
jtackaberry commented 4 years ago

@harlow this seems like a pretty nice solution, exposing the shard id while also maintaining backward compatibility. I could test this out and submit a PR if you're amenable.

This beats, by a large margin, what I'm currently doing, which is to implement a custom Storage that wraps the supplied DDB Storage, intercept SetCheckpoint() and rather than calling out to the underlying DDB layer, maintain a map of sequence number to shard id, which I can reference later from the asynchronous task, which at least knows the sequence number. This kinda works, but it's a frightening hack, and I'm not sure there is any guarantee of sequence number uniqueness across shards (though collisions seem unlikely, it's not strictly correct).

Thanks for the speedy reply!

harlow commented 4 years ago

I could test this out and submit a PR if you're amenable.

yeah sounds great. i'm game to merge that in if it feels good

jtackaberry commented 4 years ago

Great. I have it working now. Will do a bit of testing tomorrow and fire off a PR. Thanks!

jtackaberry commented 4 years ago

Closing now that #121 is merged. Thanks for the insights, Harlow.