TREEcg / event-stream-client

Deprecated! Use the rdf-connect/ldes-client instead
https://github.com/rdf-connect/ldes-client

Bookkeeping strategy in an LDES and resuming using a bookkeeper state #33

Open pietercolpaert opened 2 years ago

pietercolpaert commented 2 years ago

The most generic strategy that doesn’t need to interpret anything about a specific member:

The bookkeeper keeps track of visited pages. For a page that may still change, the next polling moment is kept in a dictionary; it is derived from the max-age directive of the Cache-Control header, or from a polling interval when only an ETag is given. When the Cache-Control header marks a page as immutable, its URL is instead added to a memory-efficient table of visited pages, which do not need to be polled again.
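A minimal sketch of what such a page bookkeeper could look like in TypeScript. All names here (`PageBookkeeper`, `recordFetch`, `duePages`, the 60-second default interval) are hypothetical and not part of the existing client. As a simplification, the fallback polling interval is used whenever no max-age is present, regardless of whether an ETag was sent, and a plain `Set` stands in for the memory-efficient table.

```typescript
// Hypothetical sketch; none of these names exist in the event-stream-client.
type CachePolicy = { immutable: boolean; maxAgeSeconds?: number };

// Extract only the Cache-Control directives this strategy cares about.
function parseCacheControl(header: string | null): CachePolicy {
  const policy: CachePolicy = { immutable: false };
  if (!header) return policy;
  for (const part of header.split(",")) {
    const [name, value] = part.trim().toLowerCase().split("=");
    if (name === "immutable") policy.immutable = true;
    if (name === "max-age" && value !== undefined) {
      const seconds = Number.parseInt(value, 10);
      if (!Number.isNaN(seconds)) policy.maxAgeSeconds = seconds;
    }
  }
  return policy;
}

class PageBookkeeper {
  // Pages that will not change again (a Set stands in for a compact table).
  private readonly immutablePages = new Set<string>();
  // Mutable pages mapped to their next polling moment (epoch milliseconds).
  private readonly nextPoll = new Map<string, number>();
  // Assumed fallback interval when the response carries no max-age.
  private readonly defaultPollingIntervalMs: number;

  constructor(defaultPollingIntervalMs: number = 60_000) {
    this.defaultPollingIntervalMs = defaultPollingIntervalMs;
  }

  // Record a fetched page together with its Cache-Control header value.
  recordFetch(url: string, cacheControl: string | null, now: number = Date.now()): void {
    const policy = parseCacheControl(cacheControl);
    if (policy.immutable) {
      this.immutablePages.add(url);
      this.nextPoll.delete(url);
      return;
    }
    const delayMs =
      policy.maxAgeSeconds !== undefined
        ? policy.maxAgeSeconds * 1000
        : this.defaultPollingIntervalMs;
    this.nextPoll.set(url, now + delayMs);
  }

  isImmutable(url: string): boolean {
    return this.immutablePages.has(url);
  }

  // URLs whose polling moment has been reached.
  duePages(now: number = Date.now()): string[] {
    return Array.from(this.nextPoll.entries())
      .filter(([, at]) => at <= now)
      .map(([url]) => url);
  }
}
```

A fetch loop would call `recordFetch` after every response and periodically ask `duePages()` which URLs to request again; immutable pages never show up in that list.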

A second bookkeeper is needed for emitting the individual members of pages that are not immutable. When such a page is fetched again, we need to check for each member whether it was already emitted. Mind that this is not a list of all members, only a list of the members that are part of pages that may still change.
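A possible shape for this member bookkeeper, again only as a sketch. Keying the emitted member IDs by page URL is an assumption made here so that the entries of a page can be dropped once that page turns immutable; the class and method names are hypothetical.

```typescript
// Hypothetical sketch; names are illustrative, not the client's API.
class MemberBookkeeper {
  // Emitted member IDs, grouped by the mutable page they were seen on.
  private readonly emittedByPage = new Map<string, Set<string>>();

  // Return the member IDs of this page that were not emitted before,
  // and remember them as emitted.
  filterNew(pageUrl: string, memberIds: string[]): string[] {
    let seen = this.emittedByPage.get(pageUrl);
    if (!seen) {
      seen = new Set<string>();
      this.emittedByPage.set(pageUrl, seen);
    }
    const fresh: string[] = [];
    for (const id of memberIds) {
      if (!seen.has(id)) {
        seen.add(id);
        fresh.push(id);
      }
    }
    return fresh;
  }

  // Once a page is immutable it is never refetched, so its members
  // no longer need to be tracked.
  forgetPage(pageUrl: string): void {
    this.emittedByPage.delete(pageUrl);
  }

  // Number of member IDs currently held in memory.
  trackedMemberCount(): number {
    let count = 0;
    for (const ids of this.emittedByPage.values()) count += ids.size;
    return count;
  }
}
```

On a refetch, only the IDs returned by `filterNew` would be emitted; calling `forgetPage` when a page becomes immutable keeps the tracked set limited to pages that may still change.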

The state of both the page bookkeeper and the member bookkeeper must be exportable when the LDES client sleeps (e.g., in the case of LDES action) and importable when the LDES client is instantiated again, so that it can resume where it left off.
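One way the export and import could work is a JSON round trip, sketched below. The `LiveState` and `Snapshot` shapes and the function names are assumptions for illustration only; the actual state format is still to be decided.

```typescript
// Hypothetical sketch of a serializable bookkeeper state.
interface LiveState {
  immutablePages: Set<string>;             // pages that will not change again
  nextPoll: Map<string, number>;           // mutable page URL -> next poll (epoch ms)
  emittedByPage: Map<string, Set<string>>; // mutable page URL -> emitted member IDs
}

// JSON-friendly mirror of LiveState: Sets and Maps become arrays.
interface Snapshot {
  immutablePages: string[];
  nextPoll: [string, number][];
  emittedByPage: [string, string[]][];
}

// Serialize the state so it can be stored while the client sleeps.
function exportState(state: LiveState): string {
  const snapshot: Snapshot = {
    immutablePages: Array.from(state.immutablePages),
    nextPoll: Array.from(state.nextPoll.entries()),
    emittedByPage: Array.from(state.emittedByPage.entries()).map(
      ([page, ids]): [string, string[]] => [page, Array.from(ids)]
    ),
  };
  return JSON.stringify(snapshot);
}

// Rebuild the in-memory state when the client is instantiated again.
function importState(json: string): LiveState {
  const snapshot = JSON.parse(json) as Snapshot;
  return {
    immutablePages: new Set(snapshot.immutablePages),
    nextPoll: new Map(snapshot.nextPoll),
    emittedByPage: new Map(
      snapshot.emittedByPage.map(
        ([page, ids]): [string, Set<string>] => [page, new Set(ids)]
      )
    ),
  };
}
```

The exported string could be written to a file or any other store between runs; `importState` does no validation here, which a real implementation would probably want to add.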

KasperZutterman commented 2 years ago

To implement this behavior, several enhancements are needed (split into their own issues for ease of implementation):