openactive / realtime-paged-data-exchange

OpenActive Realtime Paged Data Exchange Specification
https://www.openactive.io/realtime-paged-data-exchange/

Losing updates when using afterTimestamp and afterId #95

Open nathansalter opened 4 years ago

nathansalter commented 4 years ago

Hello,

I started looking through the documentation and noticed a slight problem with the afterTimestamp/afterId method of iterating through pages. This is fine for pages in the past, but pages near the current timestamp can lose items from subsequent pages. Consider the following updates in this sequence:

| Operation | ID | Timestamp |
| --- | --- | --- |
| Create | 1 | 1585747490 |
| Create | 2 | 1585747491 |
| Update | 5 | 1585747492 |
| Update | 4 | 1585747492 |
| Create | 7 | 1585747493 |

Now if a client views a page between the Update of ID 5 and the Update of ID 4, they will get this page:

| Operation | ID | Timestamp |
| --- | --- | --- |
| Create | 1 | 1585747490 |
| Create | 2 | 1585747491 |
| Update | 5 | 1585747492 |

However, using the afterId of 5 and afterTimestamp of 1585747492 will produce this next page:

| Operation | ID | Timestamp |
| --- | --- | --- |
| Create | 7 | 1585747493 |

Because the item with ID 4 shares the afterTimestamp but its ID sorts before the afterId of 5, it can never appear in any subsequent page, so the update is lost.
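To make the failure mode concrete, here is a minimal sketch (an in-memory Python list standing in for the database, and a `next_page` helper that is purely illustrative) of the lexicographic (modified, id) paging predicate the specification describes, showing that the update to ID 4 can never satisfy it:

```python
# Each feed item: (operation, id, modified). The feed is ordered by (modified, id).
items = [
    ("Create", 1, 1585747490),
    ("Create", 2, 1585747491),
    ("Update", 4, 1585747492),  # written AFTER the client already read (5, 1585747492)
    ("Update", 5, 1585747492),
    ("Create", 7, 1585747493),
]

def next_page(items, after_timestamp, after_id):
    """The spec's paging predicate: modified > afterTimestamp,
    OR modified == afterTimestamp AND id > afterId."""
    return sorted(
        (i for i in items
         if i[2] > after_timestamp or (i[2] == after_timestamp and i[1] > after_id)),
        key=lambda i: (i[2], i[1]),
    )

# The client's previous page ended at id=5, modified=1585747492.
print(next_page(items, 1585747492, 5))
# [('Create', 7, 1585747493)] -- the Update of id=4 is silently skipped:
# it shares the timestamp 1585747492 but sorts before afterId=5.
```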

I'm not sure how this problem could be fixed in the specification, except by removing the requirement that items MUST be sorted by id and instead allowing them to be sorted by the order in which the updates actually occur. We've mitigated this issue slightly by using microseconds rather than seconds in the timestamp but the problem could still occur on high-traffic websites.

Does anyone have any better suggestions?

nickevansuk commented 4 years ago

Great find - and somewhat related to the race condition challenge.

As you say, using a timestamp field with a high degree of accuracy certainly mitigates it, but to completely address the issue you've identified it is also necessary to filter out items whose "modified" date falls within the last 2 seconds, which delays items appearing in the feed. As the above linked guidance mentions, many systems still exhibit small variances in the timestamps they provide, which this also covers.

This delay ensures that items with a given modified value are only read once no further items can be allocated that same modified value - which should solve the above?

As you've pointed out, this issue exists in all cases - such that the above delay should be implemented as standard practice, and not only where transactions are involved.
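A minimal sketch of that mitigation, extending the illustrative `next_page` helper above (the 2-second horizon is the delay suggested here; in SQL it would be an extra condition on the modified column):

```python
import time

def next_page_delayed(items, after_timestamp, after_id, delay_seconds=2):
    """Same paging predicate as before, but items modified within the last
    delay_seconds are withheld until their timestamp can no longer be
    shared by new writes."""
    horizon = time.time() - delay_seconds
    return sorted(
        (i for i in items
         if i[2] <= horizon  # withhold very recent items from the feed
         and (i[2] > after_timestamp
              or (i[2] == after_timestamp and i[1] > after_id))),
        key=lambda i: (i[2], i[1]),
    )
```

With this in place, by the time the client is allowed to see (5, 1585747492), the later write to (4, 1585747492) is also visible, so both appear in the same page in (modified, id) order and nothing is skipped.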

What do you think?

nathansalter commented 4 years ago

Delaying items from appearing in the feed definitely stops this issue, as long as you're using a consistent time source in the database. If you're not, you'd have to use change numbers instead anyway, so that's not an issue.
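For reference, a sketch of the change-number alternative (the names here are hypothetical, not from the spec): because a single monotonic counter assigns every write a unique ordering value, two writes can never tie, and the lost-update scenario above cannot arise:

```python
import itertools

# A single monotonic counter (e.g. a database sequence) gives each
# write a unique change number, so no two items can ever tie.
change_counter = itertools.count(1)
feed = {}  # id -> (operation, id, change_number)

def record(operation, item_id):
    # An updated item receives a new change number, moving it to the
    # end of the feed.
    feed[item_id] = (operation, item_id, next(change_counter))

def next_page_by_change_number(after_change_number):
    return sorted(
        (i for i in feed.values() if i[2] > after_change_number),
        key=lambda i: i[2],
    )
```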

I think recommending higher-accuracy timestamps and only displaying items whose modified timestamps are at least a few seconds in the past definitely mitigates this issue and should stop it from being a problem. Great idea!