Open nathansalter opened 4 years ago
Great find - and somewhat related to the race condition challenge.
As you say, using a timestamp field with a high degree of accuracy certainly mitigates it, but to completely mitigate the issue you've identified it is also necessary to filter out items whose "modified" value is less than 2 seconds in the past, so that items are delayed before appearing in the feed. As the above linked guidance mentions, many systems still exhibit small variances in the timestamps they provide, which this delay also covers.
This delay allows items with a specific `modified` value to be read only after it is no longer possible for further items to be allocated to the same `modified` value - which should solve the above?
As you've pointed out, this issue exists in all cases - such that the above delay should be implemented as standard practice, and not only where transactions are involved.
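To make the delay concrete, here is a minimal sketch of a page query that hides anything modified within the last 2 seconds. The `Item` class, the `next_page` function, the `limit` default, and the sample data are all hypothetical, and the ordering by (`modified`, `id`) is an assumption about how the feed is paged rather than a quote from the specification.

```python
from dataclasses import dataclass

# Assumption: the 2-second delay suggested in this thread.
FEED_DELAY_SECONDS = 2


@dataclass(frozen=True)
class Item:
    id: int
    modified: int  # Unix timestamp, in seconds


def next_page(items, after_timestamp, after_id, now, limit=500):
    """Return items after (after_timestamp, after_id), hiding recent ones."""
    cutoff = now - FEED_DELAY_SECONDS
    visible = [
        item for item in items
        if item.modified <= cutoff
        and (item.modified, item.id) > (after_timestamp, after_id)
    ]
    visible.sort(key=lambda item: (item.modified, item.id))
    return visible[:limit]


# Hypothetical data: two items updated within the same second (t = 100).
items = [Item(id=5, modified=100)]
early_page = next_page(items, after_timestamp=0, after_id=0, now=100)

items.append(Item(id=4, modified=100))
late_page = next_page(items, after_timestamp=0, after_id=0, now=102)
```

In this sketch `early_page` is empty, so a client's cursor cannot move past timestamp 100 while further items might still be given that value. By the time `late_page` is read, both items are visible and come back together, ordered by `id`.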
What do you think?
Delaying items displaying in the feed definitely stops this issue, as long as you are using a consistent time source in the database. If you're not, you'd have to use change numbers instead anyway so that's not an issue.
I think recommending using higher accuracy timestamps and only displaying items a few seconds in the past definitely mitigates this issue and should stop it from being a problem. Great idea!
Hello,
I started to look through the documentation and I noticed a slight problem with the afterTimestamp/afterId method of iterating through pages. This is fine for pages in the past, but pages happening after the current timestamp can possibly lose items from the subsequent pages. Consider the following updates in this sequence:
Now if a client views a page between the Update of ID 5 and the Update of ID 4, they will get this page:
However, using the `afterId` of `5` and `afterTimestamp` of `1585747492` will produce this next page:
Because the ID for `4` is sorted after the `afterTimestamp`, it cannot appear in the page starting with the `afterId` of `5`, so the update is lost.
I'm not sure how this problem could be fixed in the specification, except by removing the requirement that items MUST be sorted by id and instead allowing them to be sorted by the order in which the updates actually occur. We've mitigated this issue slightly by using microseconds rather than seconds in the timestamp but the problem could still occur on high-traffic websites.
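The lost update can be reproduced with a small sketch. It assumes pages are selected by comparing (`modified`, `id`) pairs against the cursor, and that ID 4 receives the same timestamp as ID 5. The `page_after` function and the tuple layout are hypothetical, used only for illustration.

```python
def page_after(feed, after_timestamp, after_id):
    """Return (modified, id) pairs strictly after the cursor, in feed order."""
    ordered = sorted(feed)  # sorts by modified, then by id
    return [entry for entry in ordered if entry > (after_timestamp, after_id)]


T = 1585747492  # timestamp taken from the example above

# ID 5 is updated first, and the client reads a page straight away.
feed = [(T, 5)]
first_page = page_after(feed, 0, 0)
last_timestamp, last_id = first_page[-1]

# ID 4 is then updated within the same second.
feed.append((T, 4))

# The client continues from afterTimestamp=T, afterId=5.
second_page = page_after(feed, last_timestamp, last_id)
```

Here `second_page` is empty even though `(T, 4)` is in the feed: the pair `(T, 4)` sorts before the cursor `(T, 5)`, so no later page will ever include it.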
Does anyone have any better suggestions?