Based on all the other comments and thoughts here, I wanted to suggest a new proposal, a variation on Option 1 as well as a few other things.
This proposal is optimized for the Memento Reconstruct and http://oldweb.today use case, and considers individual archives as well as aggregators. Using this system, it should be possible to implement aggregators which efficiently query archives that support raw mementos, while filtering out those that do not. Such an aggregator would be suitable for use with the Memento Reconstruct and http://oldweb.today services.
Values for Prefer/Preference-Applied
Prefer: raw - request raw unrewritten content and raw headers, where possible. Hop-by-Hop headers should be prefixed with X-Archive-Orig-
Prefer: rewritten - request rewritten content, suitable for displaying to a user (optional, default preference if omitted)
TimeGate
User makes a request with a preference: curl -H "Prefer: raw" -H "Accept-Datetime: ..." "http://archive.example.com/timegate/http://example.com/"
If the archive can satisfy the preference, it returns a 302 redirect with Preference-Applied and Vary: accept-datetime, prefer
If the archive does not support the preference for any URI-R, it returns 415 Unsupported Media Type with Vary: prefer
If the archive does not support the preference for this URI-R only, it returns 404 Not Found with Vary: prefer
If the archive does not yet support this feature, Prefer is not included in Vary (fall back to default Memento behavior)
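The four TimeGate cases above can be sketched as a small classifier on the client side. This is only an illustration: the names `Outcome` and `classify_timegate` are hypothetical, and it assumes the response is already available as a status code plus a plain dict of headers.

```python
from enum import Enum


class Outcome(Enum):
    SATISFIED = "satisfied"                    # 2xx/3xx + Preference-Applied + Vary: prefer
    UNSUPPORTED_BY_ARCHIVE = "unsupported"     # 415: not supported for any URI-R
    UNSUPPORTED_FOR_URI = "unsupported_uri"    # 404 + Vary: prefer: not for this URI-R
    NO_EXTENSION = "no_extension"              # Prefer not reflected in the response


def vary_tokens(headers):
    """Return the lowercased field names listed in the Vary header."""
    return {t.strip().lower() for t in headers.get("vary", "").split(",") if t.strip()}


def classify_timegate(status, headers, preference):
    """Classify a TimeGate response for the requested preference (hypothetical helper)."""
    headers = {k.lower(): v for k, v in headers.items()}
    if status == 415:
        return Outcome.UNSUPPORTED_BY_ARCHIVE
    if "prefer" not in vary_tokens(headers):
        # Assumption: without Vary: prefer the archive ignored the Prefer header.
        return Outcome.NO_EXTENSION
    if status == 404:
        return Outcome.UNSUPPORTED_FOR_URI
    applied = headers.get("preference-applied", "").strip().lower()
    if 200 <= status < 400 and applied == preference.lower():
        return Outcome.SATISFIED
    return Outcome.NO_EXTENSION
```

A 415 is checked first so that it counts as "unsupported" even if the archive forgot to send Vary; that ordering is a choice made for this sketch, not something the proposal specifies.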
TimeGate Aggregator
The aggregator TimeGate accepts a Prefer header and passes it on to each individual TimeGate.
If an individual TimeGate returns a 3xx or 2xx response with both Vary: Prefer and Preference-Applied present, include the response.
If an individual TimeGate returns 415, this preference is not supported by this archive, so there is no need to query it again for this preference.
If an individual TimeGate returns 404, this preference is not supported for this URI-R, but the archive should still be included in future queries for other URI-Rs.
If an individual TimeGate does not yet support this extension, e.g. no Preference-Applied or no Vary: Prefer but an otherwise valid TimeGate response, treat it the same as a 415.
Alternative: have a lax and a strict mode that would allow including ambiguous responses, maybe through an additional Prefer: strict or Prefer: lax setting. For simplicity, just default to strict always.
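As a rough sketch of the aggregator rules above in strict mode, the aggregator could keep a per-archive record of preferences it should stop asking for. The class `PreferenceRegistry` and its method names are hypothetical, and the handling of statuses the proposal does not mention (for example 5xx) is an assumption.

```python
class PreferenceRegistry:
    """Tracks which (archive, preference) pairs an aggregator should skip (sketch)."""

    def __init__(self):
        self._unsupported = set()  # {(archive, preference)}

    def should_query(self, archive, preference):
        return (archive, preference) not in self._unsupported

    def record(self, archive, preference, status, headers):
        """Record a TimeGate response; return True if it should be included."""
        h = {k.lower(): v.lower() for k, v in headers.items()}
        vary = {t.strip() for t in h.get("vary", "").split(",")}
        has_extension = "prefer" in vary

        if status == 415:
            # Not supported for any URI-R: stop querying for this preference.
            self._unsupported.add((archive, preference))
            return False
        if status == 404 and has_extension:
            # Not supported for this URI-R only: keep querying for others.
            return False
        if 200 <= status < 400:
            if has_extension and "preference-applied" in h:
                return True
            # Strict mode: a valid response without the extension counts as a 415.
            self._unsupported.add((archive, preference))
            return False
        # Assumption: other statuses are excluded without drawing conclusions.
        return False
```

For example, after `record("b", "raw", 415, {"Vary": "prefer"})`, `should_query("b", "raw")` returns False while `should_query("b", "rewritten")` still returns True.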
TimeMap
A TimeMap should also accept a Prefer header and include only responses that satisfy the preference, e.g. a TimeMap requested with Prefer: raw should only include URI-Ms that are raw mementos.
If a TimeMap does not support a specific preference for any URI-R, it should return 415 Unsupported Media Type and Vary: Prefer
If a TimeMap does not support a specific preference for just this URI-R, it should return 404 Not Found and Vary: Prefer
If the aggregator is caching the TimeMap, the cache key should include both the preference and the URL.
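A minimal sketch of such a cache key, assuming an in-memory dict cache. The function name `timemap_cache_key` is hypothetical; defaulting to "rewritten" follows the statement above that it is the default preference when Prefer is omitted.

```python
def timemap_cache_key(url, preference="rewritten"):
    """Build a cache key that keeps raw and rewritten TimeMaps apart (sketch)."""
    return (preference.strip().lower(), url)


# Usage: the same URL is cached separately per preference.
cache = {}
cache[timemap_cache_key("http://example.com/", "raw")] = "raw TimeMap body"
cache[timemap_cache_key("http://example.com/")] = "rewritten TimeMap body"
```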
TimeMap Aggregator
A TimeMap Aggregator should pass the Prefer header to the individual TimeMap URLs, and should merge only 200 responses with Preference-Applied and Vary: Prefer.
As with the TimeGate, a 415 response indicates that the archive should not be queried again with this preference, and that result should be cached.
As with the TimeGate considerations, TimeMaps without Vary: Prefer should be considered erroneous and not included, unless a lax and strict option is supported.
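The merge rule can be sketched as follows. This assumes each archive's TimeMap response has already been fetched and parsed into a dict with `status`, `headers`, and a `mementos` list of `(datetime, uri_m)` tuples; that shape and the name `merge_timemaps` are made up for illustration, and parsing the link-format body is out of scope here.

```python
from datetime import datetime


def merge_timemaps(responses, preference):
    """Merge only 200 responses that carry Preference-Applied and Vary: Prefer."""
    merged = []
    for resp in responses:
        h = {k.lower(): v.lower() for k, v in resp["headers"].items()}
        vary = {t.strip() for t in h.get("vary", "").split(",")}
        if resp["status"] != 200:
            continue  # 415, 404 and other errors contribute nothing
        if "prefer" not in vary:
            continue  # strict mode: archive does not implement the extension
        if h.get("preference-applied", "").strip() != preference.lower():
            continue  # a different preference (or none) was applied
        merged.extend(resp["mementos"])
    return sorted(merged)  # chronological order across archives


responses = [
    {"status": 200,
     "headers": {"Vary": "Prefer", "Preference-Applied": "raw"},
     "mementos": [(datetime(2015, 1, 1), "http://a.example/raw/2015/http://example.com/")]},
    {"status": 200,
     "headers": {},  # no extension support: skipped in strict mode
     "mementos": [(datetime(2014, 1, 1), "http://b.example/2014/http://example.com/")]},
    {"status": 415, "headers": {"Vary": "Prefer"}, "mementos": []},
]
raw_mementos = merge_timemaps(responses, "raw")
```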
Memento
(Optional) Each URI-M includes a Preference-Applied header describing the dimension of rawness (so far, just rewritten or raw), the same one as was returned by the TimeGate prior to the redirect, or listed in the TimeMap. The URI-M should not include a Vary: Prefer.
Each URI-M can have one and only one Preference-Applied associated with it and it must not change based on any other header.
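A client or aggregator could sanity-check a memento response against these two rules. The following is a sketch under the assumption that the expected preference is known from the TimeGate or TimeMap; the function name `memento_headers_consistent` is hypothetical.

```python
def memento_headers_consistent(expected_preference, headers):
    """Check a URI-M response against the rules above (illustrative only)."""
    h = {k.lower(): v.strip().lower() for k, v in headers.items()}
    vary = {t.strip() for t in h.get("vary", "").split(",")}
    if "prefer" in vary:
        return False  # a URI-M should not vary on Prefer
    applied = h.get("preference-applied")
    # Preference-Applied is optional on the URI-M, but must match if present.
    return applied is None or applied == expected_preference.lower()
```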