Open berezovskyi opened 3 years ago
I will start addressing these comments in a new branch b-trs30-draft3, as discussed in the meeting on May 20th. I do not expect all to be resolved in that new draft, so I've added a checklist above, and will mark off the mainly editorial actions as I make them. I will probably leave some open to be discussed later - the topic of discovery, root services, etc. is one of those.
@jamsden do you still want to go over these items?
Yes
We are about to publish trs-v3.0-ps01. @berezovskyi, are the any of these items that should be addressed in ps01?
Jim, I am on vacation this week. The most important issue is to remove the language that allows items to appear out of order, e.g. 5 7 8 9 6
(ie 5
on the first request, 5 7
on the second, then 5 7 8
, then 5 7 8 9
and finally 5 7 8 9 6
). This may be issues 21, 27, 30 but I am not sure unless I cross-check with the draft.
Our client triggers a full rebase every time it encounters such a SNAFU. Removing text from the spec that condones such behavior would allow users to go their vendors and file bug reports asking for events not to be published out of order but at the same time clients will still have the code that works around this.
Note that this has nothing to do with another issue I have that 5 7 8 9 10
is allowed. Better to have skips like 5 7 8 9 10
than OOOE like 5 7 8 9 6
.
Some systems deal with this through watermarking, see https://databricks.com/blog/2017/05/08/event-time-aggregation-watermarking-apache-sparks-structured-streaming.html but we don't want to go there. At the very least, we need to ensure that the order of trs:order
matches the TRS log append order (no pun intended).
I recall David and Nick arguing that it's "hard" to do in a clustered environment. As I said before, AtomicLong
allows to generate 1MM+ TRS event order numbers per second and Hazelcast/Redis allow at least 300k atomic INCR commands across multiple nodes in a cluster on a decent server HW. I am pretty sure a SQL DB with an AUTO INCREMENT key will do the trick, even if 10x slower. I am 100% confident we can enforce both constrains but here specifically we need to deal with out-of-order appends.
re: "language that allows items to appear out of order", I don't see any language that allows or implies that. Can you point to that language?
We should define two separate terms: TR event time and TRS event append time. And then restate those CCs, such that two TR events may appear out of order in the TRS (CC-23), state that TR events may be delayed from being appended to the TRS (CC-25) but once a TRS Client fetches the TRS log and observes a set of TRS events with a trs:order
of up to n
, no subsequent TRS log responses may contain appended TRS events with a trs:order < n
that were missing from a previous response (CC-17). This would mean that if CC-23/25 happen, they will have TRS event order numbers as they are appended to the log, not according to the original TR event time order.
For now the only spelled out guarantee is in CC-36, which says that when you get a new page, it will not contain events with an order number lower or equal to the max order number from a previous page.
CC-25: states under what conditions a TRS server must report a change event, it doesn't address the ordering of the events in the change log.
CC-23: indicates that change log ordering is independent for different resources, the change log order is only relevant for changes to a specific resource - in oder to ensure the state of the resource is accurately reflected over time.
CC-14, CC-17: indicate that the trs:order number sequence must correspond to the actual sequence of changes to the trs:changed resource. But does not say anything about the order of these events in the change log. They might for example appear out of order if members of clustered servers appended to the log in an asynchronous manner.
The client however can always rely on the trs:order numbers, and should not rely on their order in the change log. If they want to process them in the right order, one way would be to sort them before processing them.
So I'm still not sure any change in the spec is needed.
Part 1
trs:trackedResourceSet
property advice would apply to it? Nick: I've removed the reference to root services.trs:ChangeEvent
that is a superclass of Deletion, Modification, and Deletion. Nick: There is no such superclass.trs:cutoffEvent
indicates that the TRS Base lists tracked resources present in the system at the point right after a given event.trs:order
being strictly monotonic but not necessarily sequential. Let's use one term to denote one thing if possible.