Get initial batch from changefeed

hpx7 commented 9 years ago

For example, consider

r.table('scores').orderBy({index: r.desc('score')}).limit(10).changes()

There doesn't seem to be an easy way to determine on the client when the initial batch (of <= 10 items) has been recieved.

Having this information allows a client to distinguish between "data loading" and "data changing".

hpx7 commented 9 years ago

I do understand that the absence of the old_val key indicates that a particular row is part of the initial batch; however, this is not sufficient in telling us which row represents the final one of the initial batch.

Consider the example above. In the case where the scores table contains less than 10 rows to begin with, we are unable to determine the end of the initial batch on the client.

mlucy commented 9 years ago

@hpx7 -- there isn't really a good way right now. We went back and forth on this, but decided against complicating the format for the initial release. What sort of interface would you like to see?

hpx7 commented 9 years ago

Looking at the javascript driver, it seems like the EventEmitter interface could be useful - there could be a loaded event which indicates the end of the initial batch. Not sure how this would be implemented in the other drivers though.

Another option I see is simply adding a field to the rows returned from the changefeed (in addition to new_val/old_val). Perhaps an integer remaining field or simply a boolean lastOfBatch field.

Something like the "remaining" concept could possibly be implemented on the cursor level instead as well.

mlucy commented 9 years ago

A pretty reasonable protocol implementation would be to add a field {note: "last_initial_val"} to the last initial document. That would let people who don't care about this information ignore it, and it would be backwards-compatible so it would avoid breaking any code that was written for 1.16 changefeeds.

kofalt commented 9 years ago

In general, with the scope of queries that can be watched for changes expanding, it would be nice to have a way to clearly distinguish between:

The results of the query as though I had not appended .changes()
Any changes / deltas following that result set.

From the current documentation:

The first notification object in the changefeed stream will contain the query's initial value in new_val and have no old_val field.

This sounds a little ambiguous in the case of multiple initial documents, but it sounds like one can unambiguously distinguish any query's initial results from its change events by waiting for the first response with an old_val key.

If so, then adding a note key makes sense as we won't have to block on the first change to make that distinction. This conversation is strongly related to the return_initial proposal in #3579.

Tryneus commented 9 years ago

Maybe this is crazy, it'd probably require redoing the protocol a bit, but what if we returned two cursors, one for the initial value and one for new changes?

kofalt commented 9 years ago

Handling the case where a query may return more than one cursor sounds like a lot of complexity; can you envision a use case where you'd both want the initial data set, and want to read new changes before reading said initial data?

hpx7 commented 9 years ago

The only issue I see with the {note: "last_initial_val"} solution is that it requires the database to know, when publishing a particular row, whether that row is the final one. I can imagine scenarios in which this may not be a desirable property.

One solution that doesn't have this problem is simple adding a {ready: true} message to the protocol, but extending the protocol to support this may break backwards compatibility.

kofalt commented 9 years ago

I suppose the {note: "last_initial_val"} option will also not work in the case where a query happens to not return any values initially, which is something you may not be able to predict depending on your current application state.

mlucy commented 9 years ago

That's a fair point about not being able to attach the note if there are no initial values. Maybe breaking backwards compatibility is better then. (We could also omit the {ready: true} documents by default and have an optarg to turn them on; that would preserve backwards compatibility while letting people who really need this information get it.)

kofalt commented 9 years ago

That seems reasonable. To be clear, its sounds like the resultant usage would be:

Set an optarg, mark_ready or something, on a query with .changes()
Handle documents lacking a ready key as a query result (extracting from new_val key)
Read a document with a ready key, discard
Handle all future documents as a changefeed event (with new_val and old_val keys).

Which does not require the database know which is the final document in a query result, does not block on change events for a client to know the full result set, and also handles the case of no results.

As a bonus, aside from extracting the new_val key in step 2, you can pretty much use identical code to process query results with or without changefeeds.

neumino commented 9 years ago

Is it possible to add the documents to the feed only when their value is added to the initial stream? The "only" consistency required is that the documents are up to date.

So I'm fine with the following situation:

Ask for a change feed on a range that contains doc1 and doc2.
Update doc1.0 to doc1.1 and doc2.0 do doc2.1
Retrieve doc1.0 and doc2.1 and get only the change for doc1.0 -> doc1.1

kofalt commented 9 years ago

If I understand your example correctly, that sounds like it has more to do the MVCC nature of the database than separating initial results from changefeed events.

I don't think adding {ready: true} behaviour would would effect your example in any way.

I would expect: doc1.0, doc2.1, ready, doc1.0 -> doc1.1.

neumino commented 9 years ago

Yes, that works for me.

Basically what matters for my use case is to be sure to have the documents up to date, not to get the changes from a precise snapshot (that's why initially retrieving doc1.0 and doc2.1 is fine as long as I get the following update doc1.0 -> doc1.1).

That being said, I have no idea how much work it would be : )

marshall007 commented 9 years ago

This was brought up in the Q&A after the 1.16 webinar around 23:02 (YouTube link) and it sounded like the justification for not simply returning the entire result set on every change was to reduce network overhead. I can appreciate why this behavior is desirable but I think for a lot of cases the simplicity of receiving the entire result set every time is worth it for simplicity.

So maybe we could have a return_vals or include_all optarg.

mlucy commented 9 years ago

@marshall007 -- could you open a separate issue for the include_all optarg?

@neumino -- we currently don't support return_initial on tables or ranges, but when we do it will probably have the semantics you describe.

danielmewes commented 9 years ago

The point about return_initial on tables is a good one. Some of the possible implementations for that would actually send changes for part of the range out before the initial results for the whole table have been completely sent.

So the semantics proposed by @kofalt

Set an optarg, mark_ready or something, on a query with .changes()

Handle documents lacking a ready key as a query result (extracting from new_val key)

Read a document with a ready key, discard

Handle all future documents as a changefeed event (with new_val and old_val keys).

would need to be relaxed a little bit in that case. We could still send the ready key once we are done with all initial results, but we might send some update events (with both old_val and new_val) even before that point.

Apart from that detail I quite like the proposal.

mlucy commented 9 years ago

I think we should do what @kofalt suggested, but with the relaxed semantics @danielmewes mentioned (i.e. you might get some changes before the {ready: true} document, but when you get the document you'll be up-to-date).

The only thing that I think is left to discuss here is names and generality. I think we might want to have a more general interface where when you call query.changes(note_states: true), you will get documents telling you what state the changefeed is in. Right now the states would be {state: 'initializing'} and {state: 'ready'}, but we could add more in the future. For example, if we decide to support resumable changefeeds with backfilling semantics, we could add a {state: 'backfiling'} or something.

kofalt commented 9 years ago

We could still send the ready key once we are done with all initial results, but we might send some update events (with both old_val and new_val) even before that point.

That strikes me as a reasonable compromise, as long as there's an unambiguous way to distinguish a query result from a changefeed event. My current understanding would accomplish that by checking for the existence of an old_val key.

I think we might want to have a more general interface where when you call query.changes(note_states: true), you will get documents telling you what state the changefeed is in.

We've implicitly already agreed on this, but I would definitely want those status notifications to be separate documents, and not attached to query/change documents.

kofalt commented 9 years ago

Reconsidering for a moment, I have two comments on @danielmewes' proposal.

The first is that it could lead to rather unintuitive results. Asking for .changes() on a large query result (such as a table) could result in:

A change event for doc X, from v1 to v2
The query result for doc X, still at v1.

In that order. I would consider this confusing for RDB users who have not read this discussion.

Notably, I'm not familiar enough with the underlying implementation to be sure if this is possible. I defer to RDB contributors to consider or dismiss the above scenario, and if it's an concern.

Second and more importantly, it significantly increases the complexity of a user's mental model and client code. In the "leaderboard" use case, a client consuming in-order documents is two for loops; nearly identical to consuming changefeeds without return_initial.

Meanwhile, consuming out-of-order documents instead comprises of incrementally building both the result set and change set simultaneously, and triggering code when the former is complete. Based on your application, the changefeed may be useless to you until you have all the initial data. At this point, I notice that my code is reimplementing RDB's internal changefeed buffer, and poorly at that.

Would it be reasonable to request a further option to give a harder guarantee on in-order "query THEN changes" results? Overly large result sets (like an entire table) strike me as mainly useful for things like replication, and are inherently a more advanced use case than the aformentioned leaderboard query. If you set in_order: true or something, and the changefeed buffer spills before you read all the query results, that's your fault :)

Maybe so much so that it's worth offering the guarantee by default, as the consumption of an out-of-order document set feels more complicated to explain than it ought to be for average use cases.

mlucy commented 9 years ago

I think we should preserve ordering on a per-primary-key basis. So you can get [{new_val: {id: 0}}, {old_val: {id: 0}, new_val: {id: 0, a: 1}}, {new_val: {id: 1}}, {state: 'ready'}], but you could never get [{new_val: {id: 1}}, {old_val: {id: 0}, new_val: {id: 0, a: 1}}, {new_val: {id: 0}}, {state: 'ready'}]. So when you get a change, you'd always have a document to apply it to right away rather than having to keep it around in a separate queue.

kofalt commented 9 years ago

I think we should preserve ordering on a per-primary-key basis. ... you'd always have a document to apply it to right away

That doesn't sound too bad.

You'd know better than I would if that implies troubling performance down the road, as your proposal would require remembering the set of primary keys already sent. I'm envisioning a troublesome query like r.table('a').orderBy('z').changes(), where z is not the primary key. (Whether or not that query could be expected to work well in the first place is left as an exercise for the reader.)

mlucy commented 9 years ago

There seems to be agreement on the general nature of this but not the specifics. @danielmewes, @coffeemug -- how do you feel about my state proposal above?

danielmewes commented 9 years ago

The state proposal sounds good.

coffeemug commented 9 years ago

@mlucy -- I also like the general proposal, but could you write up a complete spec/summary here so there is no confusion? In particular, there are questions about names, possible states and what they mean, optargs/names (if any), etc.

mlucy commented 9 years ago

We add an optarg note_states to changes. It defaults to false but can be set to true on any changefeed.
If note_states is true, we will occasionally insert documents of the form {state: X} into the stream of changes.
One value for X will be initializing. We'll send {state: 'initializing'} as the first document of a feed which returns initial values.
Another value for X will be ready. We'll send {state: 'ready'} whenever we're done sending initial values. (So for a changefeed which doesn't return initial values, {state: 'ready'} will be the first document.)
We reserve the right to add more states in the future.
We reserve the write to send changes before {state: 'ready'}, even though we don't do so right now, but if we do so then we'll guarantee that the initial value for a particular primary key is always seen before any changes to that primary key.

mglukhovsky commented 9 years ago

Is note being used as a verb here? I assumed it was a noun, and got confused. Can we use a clearer verb, like include_states?

coffeemug commented 9 years ago

:+1: for include_states

Also, moving to 2.0 after talking to @mlucy (since it's easy to do and pretty important).

danielmewes commented 9 years ago

I didn't like include_status in this context at first. But I can't tell why and also cannot come up with a better name (I had considered send_status, but don't like that much either).

Hence include_states is good.

danielmewes commented 9 years ago

Also :+1: on all the other details in https://github.com/rethinkdb/rethinkdb/issues/3709#issuecomment-73609858

I think we can mark this settled.

kofalt commented 9 years ago

+1 on @mlucy's proposal and +1 and for include_states. LGTM

danielmewes commented 9 years ago

Thanks @hpx7, @marshall007, @kofalt for all your feedback in this and other threads by the way! Was very helpful.

neumino commented 9 years ago

Just to be sure that I understood the proposal. Suppose I have 3 documents in my table.

I'll first get {state: 'initializing'}, then some changes, and then {state: 'ready'}.

Suppose that no writes happen on the table, I'll get

[
  {new_val: {id: 1}}, 
  {new_val: {id: 2}}, 
  {new_val: {id: 3}}
]

If some writes happen on the table, I can get changes about the document with id: x only after I receive {new_val: {id: x}}.

If someone insert {id: 4} before ready is sent, will I get {new_val: {id: 4}} then {new_val: {id: 4}, old_val: null}? Or can I get straight {new_val: {id: 4}, old_val: null}? Or just {new_val: {id: 4}}?

Could we have a note to distinguish this feed from other feeds? Such that generic libraries that want to build a higher API (like maintaining a list of changing documents) don't have to wait for the first element to know what they are dealing with (similar to the atom feed)?

One last picky thing. While include_states sounds better than note_states, it doesn't quite describe what the option is for. Shouldn't it be more something like initial_values: true (or something better)?

:+1 overall it looks like a clean API :) I would be happy to add that on thinky :-)

danielmewes commented 9 years ago

@neumino

Could we have a note to distinguish this feed from other feeds?

You mean a feed that has include_states enabled? That's a good idea, I think we should do that through our ResponseHints interface (https://github.com/rethinkdb/rethinkdb/issues/3715).

Shouldn't it be more something like initial_values: true (or something better)?

That seems like a much worse description of what include_states does to me. I mean you can get the initial values without include_states.

If someone insert {id: 4} before ready is sent, will I get {new_val: {id: 4}} then {new_val: {id: 4}, old_val: null}? Or can I get straight {new_val: {id: 4}, old_val: null}? Or just {new_val: {id: 4}}?

I think you would either get only {new_val: {id: 4}, old_val: null} or only {new_val: {id: 4}}, but not both.

mlucy commented 9 years ago

This is in next (CR by Marc).

rethinkdb / rethinkdb

Get initial batch from changefeed #3709