dburriss opened this issue 2 years ago
@Grepsy you expressed interest in picking up this issue? Take a read and let me know if you're still interested.
Yes, I'm still interested ;-)
When considering
The messaging mechanism relies on polling for changes, so it is important that requests for pages, especially the latest page (tail), put as little strain on the server and database as possible.
and
Worker for updating cache of tail page
It seems going for a background worker that frequently updates the tail page might not be beneficial for reducing database strain. A worker that refreshes the cache every x seconds puts a constant load on the database. An on-demand cache (populated only when a request arrives) with a very short expiration produces no load in the absence of requests, yet under high request load its maximum load is the same as a background worker's.
If we have a requirement for high-performance, near-real-time event publishing, I think it makes more sense to expose a websocket endpoint where events are streamed instantly instead of polling. This is the same mechanism I have often seen used by exchanges publishing real-time price and trade updates. Historical data would of course still be served through the paged API.
It seems going for a background worker that frequently updates the tail page might not be beneficial for reducing database strain.
You raise a good point. The more I think about it, and consider your points here, the more I believe fleshing out a pub/sub mechanism that fires when an event is added will give us some options here.
Some quick ideas:
So the above makes me think we can make some assumptions:
If we have a requirement for high-performance, near-real-time event publishing, I think it makes more sense to expose a websocket endpoint where events are streamed instantly instead of polling.
High performance is relative, but it isn't the explicit goal. There are plenty of tools that provide that, but they are complex to set up and run effectively. Adding websockets may be a good idea, but I would like to layer complexity on as needed. The starting point needs to be a REST-based feed with a pull-based mechanism at its core.
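A pull-based feed consumer could look something like the sketch below. The endpoint shape and field names are assumptions for illustration, not the project's actual API: the consumer walks completed pages via their next links, hands each event to a callback, and stops at the tail, which it would re-poll later from the returned URL.

```typescript
// Hypothetical client-side view of the pull-based feed.
interface FeedPage {
  events: string[];
  next: string | null; // link to the following page, if any
  complete: boolean;   // completed pages are immutable
}

// Reads all currently available events and returns the URL to resume from.
function drainFeed(
  fetchPage: (url: string) => FeedPage, // stands in for an HTTP GET
  onEvent: (e: string) => void,
  start: string,
): string {
  let url = start;
  for (;;) {
    const page = fetchPage(url);
    page.events.forEach(onEvent);
    if (!page.complete || page.next === null) {
      return url; // reached the (mutable) tail; poll it again later
    }
    url = page.next; // advance past an immutable, completed page
  }
}
```

A real consumer would also need to remember its position within the tail page between polls to avoid re-delivering events; that bookkeeping is omitted here.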
Summary
In order for the paging to be as performant as possible, we want to cache page results.
Initial design notes here.
Related concepts:
Motivation and goals
Since the return of events is based on paging, and the list of events is immutable, we can cache any completed page indefinitely since the contents of the page will never change. The messaging mechanism relies on polling for changes, so it is important that requests for pages, especially the latest page (tail), put as little strain on the server and database as possible.
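The completed-versus-tail distinction above reduces to a simple cache-lifetime rule, sketched below. The names and the TTL value are illustrative, not from the project.

```typescript
// Choose a cache lifetime from the page's completeness: a completed page
// is immutable and can be cached indefinitely; only the tail page needs
// a short expiration.

interface PageInfo {
  number: number;
  complete: boolean;
}

const TAIL_TTL_MS = 1_000; // short, illustrative TTL for the mutable tail

// Returns the TTL in milliseconds, or null for "cache forever".
function cacheTtlFor(page: PageInfo): number | null {
  return page.complete ? null : TAIL_TTL_MS;
}
```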
Since caching is so important, relying on response caching alone is unreliable. Not only does output caching allow clients to send a no-cache header, it is also inappropriate for the tail page, as only the server knows when that page will change. Even a few clients requesting the last incomplete page every second could put a significant load on the server and database. So a solution needs to consider the following goals:
In scope
Out of scope
Unknowns
Are IMemoryCache and IDistributedCache acceptable for the implementation of this (and using existing implementations)?