Open acoburn opened 5 years ago
I would consider these implementation decisions. Tagging @hvdsomp.
I think it would help to bridge LDP and Memento by defining:
I used broader language above so that while Memento is the main candidate, it can potentially be extended - but that's just speculation for now.
I think the Solid spec should draw from or align with https://fcrepo.github.io/fcrepo-specification/#resource-versioning . We need to process/digest this properly and should have some implementation experience before throwing something into the spec.
We can answer the questions you've raised once we have some consensus that:
I'm not certain myself, but I would like to discuss to what degree Memento is required of, or baked into, Solid servers. From an application's perspective, I think it would be great to have it (so that's more like a MUST), but from a server's perspective MUST is a high bar: we would basically be forcing all servers to make a certain "promise" about resources. Server or resource "owners" may not want to commit to that, and it may not even be particularly useful or sensible for the kind of data they generally store or serve. SHOULD/MAY? At the very least, we can use language like "if a server supports Memento, then it MUST..." which will allow interop with clients with Memento know-how.
As interesting and useful as Memento is, I strongly suggest a MAY only. Memento is an orthogonal specification; no need to mandate it.
Pretty much everything in Solid is an orthogonal spec. The spec already (or plans to) mandate a number of things, so Memento is no different, if of course desired. What the Solid spec would be doing is clarifying the relationships between Memento and everything else in Solid. Or more broadly resource versioning. I think that is needed and now is a good time to map that out.
I'm hoping to extend the discussion to the implications of a system (and eventually the ecosystem) that is generally aware of versioning. Understanding that better would help to determine whether this should be a MUST, SHOULD, or MAY.
Pretty much everything in Solid is an orthogonal spec.
Agreed; I meant orthogonal to the whole of what we now consider as "Solid", i.e., the combination of LDP+WAC+Auth.
I would need to see very strong arguments for anything other than a MAY. Of course, we can also have a trivial implementation of Memento which can easily be a MUST, i.e., Memento says that versioned resources must support datetime conneg. A server can easily be fully compliant with Memento without supporting it, namely by just not offering versioned resources at all.
I would very strongly suggest only a MAY too (simply because I think it's a value-add feature that many implementors may not want to be burdened with). I also think it offers nice differentiation (and justification) for Enterprise implementations to provide lots of nice features around versioning.
But I do agree with Sarven's suggestions for the spec to provide guidance for implementations that choose to implement Memento to help interoperability.
I was hoping that we continue to develop our understanding before jumping into a decision on the conformance level. Without looking at use cases, having some implementation experience, or at the very least having confidence in the kind of space we envision, the decision is arbitrary. It entirely depends on our assumptions, and we ought to document some scenarios that ground our reasoning.
While what may happen out there is at best speculation, I thought that MUST may be difficult to achieve for implementers and resource "owners". I think we generally agree on that, but that doesn't imply MAY. Not MUST is not a free pass to MAY. We can't go from "difficult to enforce" to MAY. All we can say for now is that it is probably not a MUST and that we need to bring forth more arguments and discussion. We need "very strong arguments" for MAY as well as anything else. Calling MUST, SHOULD, MAY or nothing at all is relatively the easy part once we know exactly where we are heading. How important is it? What do we gain/lose? Tradeoffs? Risks?
There is nothing particular about MAY that's preferable to not saying anything. Sure, it is a good signal: we think it is relevant and makes sense. However, ultimately it is not required for interop, with the exception of prescribing some glue for LDP/Memento for those that care: "if you do happen to implement x, make sure to do y". After all, we can throw in MAYs for whatever we think is interesting or useful out there. So what? The point is that even if/when we eventually arrive at a MAY, we more or less need to hint at the why in the spec, and not "oh four people showed up on github and thought really hard about it" :)
Apologies for a late response. I've been feeling a bit under the weather. A few things:
While a Memento resource may carry with it a promise of immutability, there is a very practical need to be able to delete these resources.
In order to comply with legal and/or privacy-related rules (e.g. GDPR), it will be absolutely necessary for an implementation to have some mechanism to purge data from its history. Whether that mechanism is part of the public HTTP interface can be a separate question, though it is worth noting that even the Fedora API supports DELETE for Memento resources.
I very much agree with that, but, with my answers, wanted to point out what the expectation per the Memento spec is. Reality can obviously differ. We (creators of the Memento spec/tools) had conversations like this with the editors of the Fedora API spec, who were dealing with challenges that were very similar to the ones you describe. The Fedora API spec reflects the results of those conversations.
I have to ask a dumb question because I want to be clear. Perhaps I'm overlooking the obvious thing. As I understand it, the resource state (as per AWWW) that Memento refers to is about the representation, and so the promise of immutability is that it didn't change. If so: does that actually exclude the case where a resource ceases to exist? After all, we can't get to the resource's state to determine if it changed or not. Hence, wouldn't that permit deletion of resources without conflicting with expected Memento behaviour?
That's pretty meta, but yes ;-)
Generally speaking, the intent is that a resource representation doesn't change once it's been flagged as a Memento. But, reality doesn't make it easy to live up to that promise. For example, in web archiving, where the Memento protocol is used abundantly:
Regarding the latter, RFC7089 has some language that has its grounding in digital preservation practice:
Although a Memento encapsulates a prior state of an Original
Resource, the entity-body returned in response to an HTTP GET request
issued against a Memento may very well not be byte-to-byte the same
as an entity-body that was previously returned by that Original
Resource. Various reasons exist why there are significant chances
these would be different yet do convey substantially the same
information. These include format migrations as part of a digital
preservation strategy, URI-rewriting as applied by some Web archives,
and the addition of banners as a means to brand Web archives.
Great discussion (and very interesting pointers from Herbert!).
So my probably overly simplistic summary: Memento (or more generally 'resource versioning') is not a MUST in the spec. We can offer guidance though, saying if implementations want to offer versioning, they SHOULD strongly consider Memento (for read). We can also offer the guidance that if they want to support version write they SHOULD strongly consider the Fedora API. We can expand that guidance further by explaining that in reality resources can always be deleted (giving the great example of GDPR's 'right to be forgotten'), and can also 'change' at the byte level (giving Herbert's great examples above). And finally, Solid servers that delete a resource MUST (I'm not 100% on that yet!) also delete any associated versions (e.g. Mementos).
I take Sarven's point about needing further thought - but is this the general direction?
Alternatively, we can make Memento a MUST if the server supports versions. But version support could still be a MAY.
Perhaps meta-ish. I wanted to explore and not step on any toes. If we take it as is and we want to say something about deleting Memento resources, then "SHOULD NOT delete Memento resources" will fit. That also works for legal and/or privacy-related cases. We don't want to encourage deleting, so we don't say MAY delete.
Related issue: https://github.com/solid/specification/issues/46 where the current consensus (warning: not final or official in any way at the time of this writing) seems to be that Solid's position should be the same as LDP's, i.e. "LDP servers should not re-use URIs" (re AWWW). So, that may also mean that while it is technically and socially allowed to delete a Memento resource, and even reuse the same URI for something completely different, one really should not. That possibility seems completely silly of course, but I think it is fair to acknowledge it here (for whoever is reading this in the future). I can't think of a particularly good reason right now why that may happen other than to create misinformation; the best case would be accidental (still unintentionally harmful). Perhaps out of scope for Solid but worth acknowledging nevertheless. If Solid can do something about that, we should (re: ethical web principles).
The client requests to create a URI-R, and the server creates a URI-M and includes Memento headers. Without the header, the server only needs to create a regular resource without a URI-M. The server could of course always create URI-Ms and create/update the URI-T depending on the activity. AFAICT, this aligns with the client request to create a Memento resource: https://fedora.info/2018/11/22/spec/#resource-versioning . Straightforward IMO.
PUT https://csarven.ca/linked-research-decentralised-web
Link: <http://mementoweb.org/ns#OriginalResource>; rel="type"
201 Created
Location: https://csarven.ca/linked-research-decentralised-web
GET https://csarven.ca/linked-research-decentralised-web
Link: <http://mementoweb.org/ns#OriginalResource>; rel="type"
Link: <https://csarven.ca/linked-research-decentralised-web.timemap>; rel="timemap"
Link: <https://csarven.ca/archives/linked-research-decentralised-web/ce36de40-64a7-4d57-a189-f47c364daa74>; rel="memento"
GET https://csarven.ca/archives/linked-research-decentralised-web/ce36de40-64a7-4d57-a189-f47c364daa74
Link: <http://mementoweb.org/ns#Memento>; rel="type"
Link: <https://csarven.ca/linked-research-decentralised-web>; rel="original"
Link: <https://csarven.ca/linked-research-decentralised-web.timemap>; rel="timemap"
Memento-Datetime: Mon, 22 Jul 2019 16:03:11 GMT
GET https://csarven.ca/linked-research-decentralised-web.timemap
Link: <http://mementoweb.org/ns#TimeMap>; rel="type"
Link: <https://csarven.ca/linked-research-decentralised-web.timemap>; anchor="https://csarven.ca/linked-research-decentralised-web"; rel="timemap"
Edit: Corrected to use anchor in TimeMap resource example.
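As a rough illustration of how a client might consume the responses above, here is a minimal Python sketch that groups Link header targets by relation type so the TimeMap and Memento URIs can be discovered from the URI-R response. The function names `parse_link_headers` and `links_by_rel` are hypothetical, and the regular expression only handles the simple `<uri>; param="value"` form used in these examples, not the full Link header grammar.

```python
import re

# Matches one link-value of the simple form used in the examples above:
#   <target-uri>; param="value"; param="value"
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM = re.compile(r'([\w-]+)="([^"]*)"')


def parse_link_headers(header_values):
    """Return a list of (target, params) tuples from Link header field values."""
    links = []
    for value in header_values:
        for target, raw_params in LINK_VALUE.findall(value):
            links.append((target, dict(PARAM.findall(raw_params))))
    return links


def links_by_rel(header_values):
    """Group link targets by relation type; one rel value may list several types."""
    grouped = {}
    for target, params in parse_link_headers(header_values):
        for rel in params.get("rel", "").split():
            grouped.setdefault(rel, []).append(target)
    return grouped


# Link headers from the GET response on the URI-R in the example above.
uri_r_links = [
    '<http://mementoweb.org/ns#OriginalResource>; rel="type"',
    '<https://csarven.ca/linked-research-decentralised-web.timemap>; rel="timemap"',
    '<https://csarven.ca/archives/linked-research-decentralised-web/'
    'ce36de40-64a7-4d57-a189-f47c364daa74>; rel="memento"',
]

rels = links_by_rel(uri_r_links)
print(rels["timemap"][0])
print(rels["memento"][0])
```

A client that finds no `timemap` or `memento` relation would simply treat the resource as unversioned.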
A few remarks:
This seems to follow the recommendations of the Fedora API.
Looks like there will be no TimeGate and hence no datetime negotiation in this setup. Rather a client will need to parse the TimeMap to find a Memento that meets its datetime preferences. Note that e.g. using the mementoweb/timegate server (see https://github.com/mementoweb/timegate) an external TimeGate could still be provided. Might be good to consider whether this could be supported using a configuration option that allows indicating an external TimeGate, which would result in providing "timegate" links where appropriate.
The "original" link in the response to the GET on the TimeMap URI is not correct. The link should have the URI of the Original Resource as explicit anchor.
Cheers
Herbert
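To make the "no TimeGate" remark concrete, here is a minimal Python sketch of the client-side selection it implies: given the (URI-M, datetime) pairs a client has already extracted from a TimeMap, pick the Memento for a preferred datetime. The function name `select_memento`, the example.org URIs, and the selection policy (latest Memento at or before the preferred datetime, otherwise the earliest one) are all assumptions for illustration; a TimeGate or client could reasonably choose differently.

```python
from email.utils import parsedate_to_datetime


def select_memento(mementos, preferred):
    """Pick a URI-M for the preferred datetime, or None if there are no Mementos.

    mementos: iterable of (uri_m, memento_datetime) pairs, where
        memento_datetime is an HTTP-date string as used in Memento-Datetime.
    preferred: HTTP-date string, as a client would send in Accept-Datetime.
    """
    target = parsedate_to_datetime(preferred)
    dated = sorted((parsedate_to_datetime(dt), uri) for uri, dt in mementos)
    if not dated:
        return None
    # Assumed policy: latest Memento at or before the target,
    # otherwise fall back to the earliest Memento available.
    chosen = dated[0][1]
    for when, uri in dated:
        if when <= target:
            chosen = uri
        else:
            break
    return chosen


# Hypothetical TimeMap entries; the URIs are placeholders.
timemap_entries = [
    ("https://example.org/archives/doc/v1", "Mon, 22 Jul 2019 16:03:11 GMT"),
    ("https://example.org/archives/doc/v2", "Mon, 04 May 2020 10:00:00 GMT"),
]

print(select_memento(timemap_entries, "Wed, 01 Jan 2020 00:00:00 GMT"))
```

With a TimeGate in place, the server would perform this selection itself in response to an `Accept-Datetime` request header.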
The suggestion was primarily about having the client request to create URI-R and have its URI-M available. I realise there are different reasons to support TimeMap and TimeGate. Would it make sense to require one (TimeMap in my opinion) in order for servers and clients to interop, and have the other (TimeGate) as optional?
If a server implements Memento, is there a reason why snapshot discovery and negotiation would not be available for every resource? Is it something that the server should just handle itself without any interaction with the client, i.e. without the client having to request the creation of a versioned URI-R in the first place?
(jumping in)
there's certainly a preference for servers (esp. CMSs) to implement TG/TM, but there's no explicit requirement. As @hvdsomp said, external links are fine (even though their knowledge might not be complete).
The minimum threshold is providing Memento-Datetime and link rel=original, and other servers can build TGs/TMs from that. You can also think of the (public) web having implied TG/TM links to archive.org, and the server overrides those when it knows of "better" TGs/TMs.
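As a sketch of how another server might build a TimeMap from that minimum, the Python below serializes harvested (URI-M, Memento-Datetime) pairs for one Original Resource into the link-format style used for TimeMaps in RFC 7089. The function name `build_timemap` and the second example.org entry are hypothetical; a fuller TimeMap would also carry `self`, `timegate`, and `first`/`last` relations.

```python
from email.utils import parsedate_to_datetime


def build_timemap(original_uri, observed):
    """Serialize a minimal link-format TimeMap.

    observed: iterable of (uri_m, memento_datetime) pairs, as harvested from
        responses that carry Memento-Datetime and a rel="original" link.
    """
    # Order Mementos chronologically by their Memento-Datetime.
    entries = sorted(observed, key=lambda pair: parsedate_to_datetime(pair[1]))
    lines = [f'<{original_uri}>; rel="original"']
    for uri_m, memento_datetime in entries:
        lines.append(f'<{uri_m}>; rel="memento"; datetime="{memento_datetime}"')
    return ",\n".join(lines)


observed = [
    # Hypothetical later snapshot; the URI is a placeholder.
    ("https://example.org/archives/doc/v2", "Mon, 04 May 2020 10:00:00 GMT"),
    # The URI-M and Memento-Datetime from the example earlier in this thread.
    ("https://csarven.ca/archives/linked-research-decentralised-web/"
     "ce36de40-64a7-4d57-a189-f47c364daa74", "Mon, 22 Jul 2019 16:03:11 GMT"),
]

print(build_timemap("https://csarven.ca/linked-research-decentralised-web", observed))
```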
Maybe this shares common ground with something we called solid-link-metadata. See https://pdsinterop.org/solid-link-metadata/ It can be used to create archives: in combination with an archivedate and a content hash, it opens opportunities to serve and/or store older data. More in #136
The Solid Ecosystem document mentions Memento as an optional dimension for content negotiation.
Memento, itself, defines a mechanism whereby a client can discover and retrieve previous states of a resource. Memento does not define how these previous states (Mementos) are created or otherwise managed. Is this an implementation decision or will the Solid specification take a stance on issues such as: