iipc / jwarc

Java library for reading and writing WARC files with a typed API
Apache License 2.0

Rudimentary Memento support on replay #11

Open machawk1 opened 5 years ago

machawk1 commented 5 years ago

I noticed that replaying WARCs provides a 14-digit datetime placeholder. I anticipate this will eventually be semantic, though it need not necessarily be. Either way, providing Memento (RFC 7089) HTTP response headers would give some temporal context to the capture.

As a start, providing the Memento-Datetime HTTP response header (in RFC 1123 format, e.g., Memento-Datetime: Fri, 09 Jan 2009 01:00:00 GMT) when viewing a capture from the WARC would be useful for integration with other systems.
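A minimal sketch of the conversion described above, assuming the 14-digit value is a UTC timestamp of the form yyyyMMddHHmmss. The class and method names (`MementoDatetime`, `headerValue`) are hypothetical and not part of jwarc. A custom pattern is used instead of `DateTimeFormatter.RFC_1123_DATE_TIME` because that built-in formatter does not zero-pad the day of month, so it would not reproduce the fixed-width form shown in the example.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of jwarc.
final class MementoDatetime {
    // 14-digit replay timestamp, e.g. 20090109010000 (assumed to be in UTC).
    private static final DateTimeFormatter ARCHIVE_TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Fixed-width RFC 1123 style date with a literal GMT zone.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    // Converts a 14-digit timestamp into a Memento-Datetime header value.
    static String headerValue(String timestamp14) {
        Instant captured = LocalDateTime.parse(timestamp14, ARCHIVE_TIMESTAMP)
                .toInstant(ZoneOffset.UTC);
        return HTTP_DATE.format(captured);
    }

    public static void main(String[] args) {
        // Prints: Memento-Datetime: Fri, 09 Jan 2009 01:00:00 GMT
        System.out.println("Memento-Datetime: " + headerValue("20090109010000"));
    }
}
```

The replay handler would then set this value on the response for the capture. How that is wired into the server is left out here, since it depends on the replay code.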

ato commented 5 years ago

Yep. I agree.

ato commented 5 years ago

We should also support the Accept-Datetime request header. Need index improvements (#12) for that.
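As a rough illustration of what Accept-Datetime support needs from an index, here is a sketch that parses the header value and picks the capture closest in time. It assumes the capture times for a URL are already available as a sorted set of `Instant` values. The class `AcceptDatetimeSketch` and its methods are hypothetical and say nothing about how jwarc's index is or will be structured.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch, not part of jwarc.
final class AcceptDatetimeSketch {
    // Parses an Accept-Datetime value such as "Fri, 09 Jan 2009 01:00:00 GMT".
    static Instant parseAcceptDatetime(String headerValue) {
        return ZonedDateTime
                .parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
    }

    // Returns the capture time nearest to the requested time,
    // or null if there are no captures at all.
    static Instant closest(NavigableSet<Instant> captures, Instant requested) {
        Instant before = captures.floor(requested);   // latest capture <= requested
        Instant after = captures.ceiling(requested);  // earliest capture >= requested
        if (before == null) return after;
        if (after == null) return before;
        Duration toBefore = Duration.between(before, requested);
        Duration toAfter = Duration.between(requested, after);
        // Ties go to the earlier capture.
        return toBefore.compareTo(toAfter) <= 0 ? before : after;
    }

    public static void main(String[] args) {
        NavigableSet<Instant> captures = new TreeSet<>();
        captures.add(Instant.parse("2009-01-01T00:00:00Z"));
        captures.add(Instant.parse("2009-01-10T00:00:00Z"));
        Instant requested = parseAcceptDatetime("Fri, 09 Jan 2009 01:00:00 GMT");
        System.out.println(closest(captures, requested)); // 2009-01-10T00:00:00Z
    }
}
```

A sorted structure makes the nearest-capture lookup a pair of logarithmic-time queries, which is one reason datetime negotiation depends on having a suitable index.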

machawk1 commented 5 years ago

@ato I agree re: Accept-Datetime, but I figured getting the capture/memento to report what it is would be a start toward further negotiation.

For the endpoint that receives the Accept-Datetime header, be sure to include Vary: Accept-Datetime in the response (see RFC 7089 §2.1.2).

For the capture (memento) itself, you may also want to report a Link response header that relates the capture (at a URI-M) to the live web URI (URI-R) with rel="original", e.g., the example in §4.1.1.
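A small sketch of the headers mentioned in this comment, assuming they are collected into a plain map before being written to the response. The class `MementoLinkHeaders` and its methods are hypothetical, and `http://example.com/` stands in for a real URI-R.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not part of jwarc.
final class MementoLinkHeaders {
    // Link header value relating a memento to its original resource (URI-R).
    static String originalLink(String uriR) {
        return "<" + uriR + ">; rel=\"original\"";
    }

    // Headers for the response that serves the capture itself (the URI-M).
    static Map<String, String> mementoHeaders(String uriR, String mementoDatetime) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Memento-Datetime", mementoDatetime);
        headers.put("Link", originalLink(uriR));
        return headers;
    }

    // Headers for the endpoint that negotiates on Accept-Datetime.
    static Map<String, String> negotiationHeaders(String uriR) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Vary", "Accept-Datetime");
        headers.put("Link", originalLink(uriR));
        return headers;
    }

    public static void main(String[] args) {
        mementoHeaders("http://example.com/", "Fri, 09 Jan 2009 01:00:00 GMT")
                .forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

This leaves out the other relation types that RFC 7089 defines, since the thread only asks for rel="original" as a first step.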