fcrepo / fcrepo-specification

Fedora API Specification
Apache License 2.0

Requiring support for `Content-Type: message/external-body` #37

Closed emetsger closed 7 years ago

emetsger commented 7 years ago

In section 1.3, HTTP POST:

Implementations MUST support Content-Type: message/external-body extensions for request bodies for HTTP POST that would create LDP-NRs. This content-type requires a complete Content-Type header that includes the location of the external body, e.g. Content-Type: message/external-body; access-type=URL; URL="http://www.example.com/file", as defined in [rfc2017].
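For readers unfamiliar with the header format quoted above, here is a minimal sketch of how a server might pull the `access-type` and `URL` parameters out of such a Content-Type value. It uses only Python's standard `email` package; the function name `parse_external_body` is hypothetical and is not part of the specification or of any Fedora implementation.

```python
from email.message import Message


def parse_external_body(content_type: str) -> dict:
    """Return the parameters of a message/external-body Content-Type value.

    Hypothetical helper for illustration only; not taken from the spec.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "message/external-body":
        raise ValueError("not a message/external-body header")
    # get_params() returns [(media_type, ''), (name, value), ...];
    # skip the first pair and normalise parameter names to lower case.
    return {name.lower(): value for name, value in msg.get_params()[1:]}


header = 'message/external-body; access-type=URL; URL="http://www.example.com/file"'
params = parse_external_body(header)
print(params["access-type"], params["url"])
```

What the server then does with the URL (ingest a copy, redirect, or proxy) is exactly the open question in this issue; the sketch only covers reading the header.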

I'm curious as to the motivation for calling out this specific content type? I initially read it as: a Fedora implementation, upon receiving a POST with Content-Type: message/external-body would dereference the URL and ingest the content of that URL (as an LDP-NR?). Later I understood that the use case was to support the idea of Fedora 3 external data streams.

I wonder if it would be better for this to be a MAY instead of a MUST, or simply to remove this paragraph altogether? It seems odd to me that the specification requires all Fedora implementations to carry over a use case from Fedora 3. Or, I may just not fully understand the motivation.

ruebot commented 7 years ago

@birkland https://github.com/fcrepo/fcrepo-specification/pull/54

kefo commented 7 years ago

Three things about this.

1) While my intent is not, yet, to re-open this, I'd be curious if @barmintor or @escowles, after considerable discussion at LDCX, believe external content MUST be supported versus SHOULD. Many current Fedora installations rely on this feature. At AIC we too are evaluating not storing some content in Fedora.

2) MUST was changed to SHOULD in 3.3.1, which is about POST, but not in 3.4.1, which is about PUT.

3) I think the issues @acoburn raises about supporting external-body resources and the impact that support will have on other elements of the specification deserve wider discussion. Such a discussion may surface strategies to help address the issues or at least inform any future discussion about what it means to be "in conformance" with the specification.

escowles commented 7 years ago

@acoburn @kefo I agree that we could probably dispense with the redirect functionality and replace it with a triple. On the other hand, there was some discussion here at LDCX yesterday about proxying instead of redirecting (using message/external-body; access-type=local-file), so I think it's worth considering that angle too.

ajs6f commented 7 years ago

I have no idea what local-file means in the context of a distributed impl.

escowles commented 7 years ago

@acoburn @ajs6f I agree, it's unclear to me what some of the access-type values would mean in different contexts (and even whether proxying or redirecting is a better fit, or should be indicated separately). There's currently no mention of access-type in the spec, and maybe there's nothing more we can add. I've been trying to figure out if there's anyone with hard requirements for proxying vs. redirecting, but so far all the use cases I've heard sound like either approach could work.

The two other things I would add related to that are:

In short, I think proxying vs. redirecting boils down to: do we mean that Fedora is managing content in an external system, or do we mean that the client has a reference to content it wants to manage outside of Fedora?

ajs6f commented 7 years ago

In short, I think proxying vs. redirecting boils down to: do we mean that Fedora is managing content in an external system, or do we mean that the client has a reference to content it wants to manage outside of Fedora?

Yes. And while the former is more powerful, it is also harder and makes much stronger demands on impls.

barmintor commented 7 years ago

Links require the client to be able to resolve them, which is not the case in the scenarios in which installations use external file data. In this scenario Fedora doesn't manage the contents of the file path, it points to them.

ajs6f commented 7 years ago

@barmintor Isn't that what @escowles said above? Or are you making a claim about all "scenarios in which installations use external file data"?

barmintor commented 7 years ago

I'm only reiterating that proxied content can't be replaced with Location headers.
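A rough sketch of the distinction being drawn here, assuming a server that already knows the external location of an LDP-NR. Everything in it is hypothetical: the function `serve_external`, the `mode` values, the 307 status code, the `file:` path, and the in-memory `fake_store` that stands in for storage only the server can reach.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def serve_external(url: str, mode: str, fetch: Callable[[str], bytes]) -> Response:
    """Answer a GET for externally stored content (illustrative only)."""
    if mode == "redirect":
        # The client receives the location and must resolve it itself.
        return 307, {"Location": url}, b""
    if mode == "proxy":
        # The server resolves the location; the client never sees it.
        return 200, {"Content-Type": "application/octet-stream"}, fetch(url)
    raise ValueError(f"unknown mode: {mode}")


# Stand-in for a file store that is reachable from the server but not the client.
fake_store = {"file:///archive/master.tif": b"TIFF bytes"}

redirected = serve_external("file:///archive/master.tif", "redirect", fake_store.__getitem__)
proxied = serve_external("file:///archive/master.tif", "proxy", fake_store.__getitem__)
```

In the redirect case the client is handed a `file:` location it cannot open, which is why a Location header is not a substitute when the content has to be proxied.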

birkland commented 7 years ago

At JHU, it is likely that we'll use API-X to help proxy content that is not otherwise HTTP-accessible to the client if/when the need comes up. That way we can adapt to the relevant details as they come up, rather than relying on support from a particular Fedora impl.

ajs6f commented 7 years ago

@barmintor That is my understanding of what @escowles said. I don't think anyone is arguing that proxy and reference are interchangeable...

barmintor commented 7 years ago

@ajs6f I'm sorry I fell off this, and also sorry for misunderstanding what @acoburn was saying. I'm fielding specific requests for proxied content, and that's also the migration impediment Columbia has, so my hackles got up at links addressing it. We use file: URIs to point to preservation content and specifically expect that the client cannot resolve them.

I'm trying to put together a document explaining the proxy, redirect, and reference use cases that might be communicated with external-body; that should close #41 and result in a PR for LDP-NR.