zalando / nakadi

A distributed event bus that implements a RESTful API abstraction on top of Kafka-like queues
https://nakadi.io
MIT License

Add starting offset attribute to event type partition info #816

Closed PetrGlad closed 6 years ago

PetrGlad commented 6 years ago

This is an API improvement. Our Nakadi client archives incoming streams of events in external persistent storage. Our goal is to store all events without data loss.

For our use cases we would like to get a starting offset of an event type partition that is consistent with the rest of the Nakadi API. Namely, in other parts of the API the starting offset is "offset of first event - 1". But the information about an event type partition from partition_get provides oldest_available_offset, which points exactly to the first available event. So to get a starting stream offset that is consistent with the rest of the API we would need to use cursor arithmetic to subtract 1 from it.
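To illustrate the off-by-one described above, here is a minimal sketch. It assumes offsets are plain integers, which is a simplification for illustration only: real Nakadi offsets are opaque strings, which is why the issue talks about cursor arithmetic through the API. The function names are hypothetical and not part of Nakadi.

```python
def starting_offset(oldest_available_offset: int) -> int:
    """Exclusive stream start: the position just before the first available event."""
    return oldest_available_offset - 1


def events_after(cursor: int, newest_available_offset: int) -> int:
    """Number of events delivered when streaming from `cursor`.

    Assumes the cursor is exclusive, i.e. the first delivered event is cursor + 1.
    """
    return max(0, newest_available_offset - cursor)


# Hypothetical partition that still retains events 5, 6 and 7.
oldest, newest = 5, 7

start = starting_offset(oldest)
print(events_after(start, newest))   # streaming from oldest - 1 delivers events 5, 6, 7
print(events_after(oldest, newest))  # streaming from oldest itself skips event 5
```

Under these assumptions, resetting to oldest_available_offset itself yields one event fewer than resetting to the proposed starting offset.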

We know that there is a placeholder offset begin that can be used to specify the oldest offset of the stream, whatever it is. But in our case we would like to

In the latter case we can work around this by subtracting 1 from the distance after comparing cursors with cursor-distances. But for checkpointing we use the starting offset, for instance, as an offset marker for stored data. Normally we use the offset of the last event from the last received event batch for this. But in case of restarts we might find that older events have already been discarded, and then we would still like to have the actual value of the "begin" offset for the same purpose.

We can use the Partition's oldest_available_offset as the stream start, but then we would lose the oldest event if we reset to this position, and this would introduce inconsistencies in our completeness checks, where we detect data losses. This problem is especially visible for infrequent events that are generated at intervals comparable to or longer than the retention time. For such events the available event range normally consists of 1 or 2 events or is empty.
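A completeness check of the kind described here could look roughly like the following sketch. It again assumes plain integer offsets and a checkpoint that holds the offset of the last stored event; both function names are hypothetical, and the behaviour of an empty partition is deliberately left out.

```python
def lost_events(checkpoint: int, oldest_available_offset: int) -> int:
    """Count events discarded by retention before they could be archived.

    `checkpoint` is the offset of the last event already stored, so the next
    expected event is checkpoint + 1. Everything between that and the oldest
    retained event is gone.
    """
    return max(0, oldest_available_offset - (checkpoint + 1))


def resume_cursor(checkpoint: int, oldest_available_offset: int) -> int:
    """Exclusive cursor to resume from without skipping the oldest retained event."""
    return max(checkpoint, oldest_available_offset - 1)


# After a restart: last stored event was 10, but retention already dropped 11..13.
print(lost_events(10, 14))    # events 11, 12 and 13 are missing
print(resume_cursor(10, 14))  # resume just before event 14

# Nothing was discarded: the next expected event (11) is still available.
print(lost_events(10, 11))
print(resume_cursor(10, 11))
```

With an exclusive starting offset both functions reduce to comparing the checkpoint with that single value, which is the special-casing the proposal tries to avoid.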

In all our cases the inclusive beginning offset becomes an additional special case that has to be handled separately in a Nakadi client's code, e.g. starting offset comparison, the empty or 1-event stream case, and so on. It also requires us to use cursor arithmetic with shifted-cursors to work around this inconsistency.

To make the change backwards compatible I suggest adding a new attribute to the Partition description that would point to "oldest_available_offset - 1". The attribute name could be, say, starting_offset.

To clarify:

PetrGlad commented 6 years ago

If there are no objections I can probably implement this myself.

lmontrieux commented 6 years ago

Thank you @PetrGlad for your suggestion.

We reviewed it internally, but decided that it would be better not to implement this request. Here are the reasons that motivate this decision:

Nevertheless, your proposal highlights a rough edge in Nakadi, and we will aim to improve our API and/or documentation to make its usage easier. If you have questions and/or alternative suggestions, we will of course be glad to help, and consider alternatives carefully. Feel free to re-open this issue if you would like to discuss it further.

PetrGlad commented 6 years ago

I think Nakadi should take responsibility for isolating clients from backend details and present a consistent view of the event stream in any case.

Yes, as I said in my description, the new attribute does risk confusing users; it is one of the possible backwards-compatible API changes. I think there could be alternative solutions.

In our case BEGIN always being up to date is a disadvantage. We do want to fail fast if we miss data and see exactly how many events we have lost.

I hope you're OK with us doing things like

    if (oldestCursor.offset().endsWith("--1"))
      oldestCursor
    else
      subscriptionManager.shiftOffsetBackByOne(oldestCursor)

PetrGlad commented 6 years ago

Now that we're working on a new version of Diga, I think we're going to either use a lower-level API or create subscriptions every time we start a stream. That would allow us to avoid offset resets altogether.