duraspace / pcdm

Portland Common Data Model
http://pcdm.org/models
Apache License 2.0

Should Collection/Object be a subclass of ore:Aggregation? #16

Closed jcoyne closed 8 years ago

jcoyne commented 9 years ago

pcdm:Object and pcdm:Collection are defined as subclasses of ore:Aggregation. In light of http://www.openarchives.org/ore/1.0/datamodel#Aggregation, is it incorrect to do so? The relevant passage:

Because a URI-A identifies the Aggregation, it SHOULD NOT be a URI used for another purpose such as the URI of a specific manifestation of some content ... or the URI of a human-readable "splash page", which really identifies only that page and not the Aggregation as a whole.

escowles commented 9 years ago

I think the most troubling aspect is that ORE specifies that the URIs of the Aggregation and ResourceMap(s) must be different:

Each Resource Map MUST be identified by a single protocol-based URI, which MUST be distinct from the Aggregation identified by the Resource Map.

So at a conceptual level, I think the different serializations of RDF provided by fcrepo4 are very similar to the multiple ResourceMaps in ORE: basically the same metadata in different formats, served using content negotiation. The ORE HTTP implementation guide provides examples of how to serve the different ResourceMaps using 303 redirects and content negotiation. fcrepo4 behaves similarly, except that it serves the different serializations with a 200 response code under the single URI instead of redirecting.
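To make the contrast concrete, here is a minimal sketch of the two behaviours described above. It is not taken from the ORE guide or from fcrepo4: the function names, the file extensions, and the `http://example.org/obj/1` URI are all assumptions made for illustration, and the Accept parsing ignores q-values.

```python
# Hypothetical mapping of media types to per-format URI suffixes.
FORMATS = {
    "text/turtle": ".ttl",
    "application/ld+json": ".jsonld",
    "application/rdf+xml": ".rdf",
}


def pick_format(accept):
    """Return the first supported media type listed in an Accept header.

    Deliberately naive: q-values and wildcards are ignored.
    """
    for part in accept.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in FORMATS:
            return media_type
    return "text/turtle"  # assumed default serialization


def ore_style(uri, accept):
    """Redirect pattern: the Aggregation URI 303-redirects to a distinct
    URI for the chosen serialization."""
    media_type = pick_format(accept)
    return 303, {"Location": uri + FORMATS[media_type], "Vary": "Accept"}


def single_uri_style(uri, accept):
    """Single-URI pattern: the chosen serialization is returned directly
    with a 200 from the same URI."""
    media_type = pick_format(accept)
    return 200, {"Content-Type": media_type, "Vary": "Accept"}
```

With the redirect pattern the serialization ends up with its own URI, which is what keeps the Aggregation and its description separately identified; with the single-URI pattern both share one identifier.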

So I think there is definitely some mismatch here. Since @azaroth42 and @zimeon were editors of those ORE docs, I'm interested in whether they think this is a problem or not. Maybe this is just a case of PCDM being more lax than ORE? There's nothing to stop you from implementing PCDM with separate URIs for the pcdm:Collection/pcdm:Object serializations, PCDM just doesn't say anything about it.

azaroth42 commented 9 years ago

A ResourceMap is a special serialization, essentially, that self-describes with format and license. Other serializations aren't forbidden by ORE (afaik).

On the conneg front, ORE does say 303, taking a stance on httpRange-14. I don't think we should be too concerned about that. However, one thing to be certain of is that there MUST be a separate Content-Location for each format, which SHOULD resolve separately. I believe that is an HTTP requirement, not a linked data nicety (but happy to either check or be proven wrong!). For example, even if you can get the JSON-LD representation from URI-Object directly with a 200, it MUST explicitly say that it's really URI-Object.jsonld, and if you deref URI-Object.jsonld, you get the same representation. That isn't a PCDM problem, however; it's an implementation problem ... and in our case, a question as to whether Fedora4 does that correctly.
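The URI-Object / URI-Object.jsonld example can be sketched as follows. This is an in-memory toy, not Fedora4's behaviour: the `get` function, the `BASE` URI, the suffixes, and the placeholder bodies are assumptions, and the Accept matching is a plain substring check.

```python
BASE = "http://example.org/obj/1"  # hypothetical URI-Object

# Hypothetical per-format representations: suffix -> (media type, body).
REPRESENTATIONS = {
    ".jsonld": ("application/ld+json", '{"@id": "http://example.org/obj/1"}'),
    ".ttl": ("text/turtle",
             "<http://example.org/obj/1> a <http://pcdm.org/models#Object> ."),
}


def get(uri, accept="text/turtle"):
    """Return (status, headers, body) for a GET on `uri`."""
    # Format-specific URI: serve that representation directly.
    for suffix, (media_type, body) in REPRESENTATIONS.items():
        if uri == BASE + suffix:
            return 200, {"Content-Type": media_type}, body
    # Generic URI: negotiate, answer 200, and name the specific URI
    # of the representation in Content-Location.
    if uri == BASE:
        for suffix, (media_type, body) in REPRESENTATIONS.items():
            if media_type in accept:
                headers = {
                    "Content-Type": media_type,
                    "Content-Location": BASE + suffix,
                    "Vary": "Accept",
                }
                return 200, headers, body
        return 406, {}, ""
    return 404, {}, ""
```

Dereferencing the `Content-Location` value returned for the generic URI yields the same body, which is the property described in the comment above.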

Note that in the HTTP guide there are no MUSTs; it's RECOMMENDED to do 303, which is a good recommendation in general, but not essential.

escowles commented 9 years ago

@azaroth42, so it sounds like there isn't a serious model issue here, but there may be an implementation concern about separately identifying Aggregations and ResourceMaps (if the implementation wants to).

I dug around in the HTTP 1.1 spec and found some info about what to do with the Content-Location response header if it's present, but I wasn't able to find anything specifying when to use the Content-Location header, other than RFC 2295 (transparent content negotiation, not sure if that's widely used or not). So I'm not sure whether having separate URIs for the content-negotiated representations is required or encouraged. It does seem like a good idea, regardless.

azaroth42 commented 9 years ago

Yep, I was thinking of 2295 in the HTTP space... which I agree has not been adopted.

In digging, the reference I was after was the Cool URIs for the Semantic Web W3C note: http://www.w3.org/TR/cooluris/#conneg

And followed up by: http://blog.iandavis.com/2010/11/a-guide-to-publishing-linked-data-without-redirects/

escowles commented 9 years ago

@azaroth42++ I created a ticket to discuss whether fcrepo4 should have separate URIs for the different serializations and use the Content-Location header to point to them: https://jira.duraspace.org/browse/FCREPO-1620

jpstroop commented 9 years ago

:+1:

azaroth42 commented 9 years ago

Thanks Esme!

daniel-dgi commented 8 years ago

Sorry for the necromancy, but this thread does raise an interesting issue.

Extending a data structure as opposed to containing and using one is generally considered poor form, no? I certainly wouldn't programmatically implement something like this by extending my language's array construct. Simply having and using ore aggregates would be able to handle a pcdm object needing multiple aggregations more sanely, and wouldn't blend application logic with data structure logic. Or is that too simplistic of a reduction? (total possibility)
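As a rough programming analogy for the "extend versus contain" distinction raised here (the class names below are hypothetical and are not part of PCDM or ORE):

```python
class ObjectAsAggregation(list):
    """Inheritance: the object *is* its one and only member list."""


class Aggregation:
    """A plain container of members, identified by a label."""

    def __init__(self, label):
        self.label = label
        self.members = []


class ObjectWithAggregations:
    """Composition: the object *has* any number of named aggregations."""

    def __init__(self):
        self.aggregations = {}

    def aggregation(self, label):
        # Create the named aggregation on first use, then reuse it.
        if label not in self.aggregations:
            self.aggregations[label] = Aggregation(label)
        return self.aggregations[label]


# Usage: the composed object can keep separate groupings side by side.
book = ObjectWithAggregations()
book.aggregation("pages").members.extend(["page1", "page2"])
book.aggregation("files").members.append("cover.tiff")
```

The analogy is loose: RDF subclassing is a statement about class membership rather than code reuse, so it only illustrates the shape of the concern.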

I know @DiegoPino mentioned this during the transitivity discussion. Are there others with thoughts on this?

azaroth42 commented 8 years ago

The proxy construction for ordering makes everything Aggregations anyway, due to domain and range. So I don't see any benefit to taking it out, or any disadvantage of leaving it all in.
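The domain-and-range point can be sketched with a few lines of RDFS-style inference over plain tuples. The domain and range entries below reflect my reading of the ORE vocabulary (ore:proxyIn has domain ore:Proxy and range ore:Aggregation); the `ex:` resources and the `infer_types` helper are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Domain and range declarations, as I understand the ORE vocabulary.
DOMAIN = {
    "ore:proxyIn": "ore:Proxy",
    "ore:aggregates": "ore:Aggregation",
}
RANGE = {
    "ore:proxyIn": "ore:Aggregation",
    "ore:aggregates": "ore:AggregatedResource",
}


def infer_types(triples):
    """Apply the rdfs:domain and rdfs:range rules once to each triple."""
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate in DOMAIN:
            inferred.add((subject, RDF_TYPE, DOMAIN[predicate]))
        if predicate in RANGE:
            inferred.add((obj, RDF_TYPE, RANGE[predicate]))
    return inferred


# An ordering proxy for one page of a (hypothetical) book object.
triples = [
    ("ex:proxy1", "ore:proxyIn", "ex:book"),
    ("ex:proxy1", "ore:proxyFor", "ex:page1"),
]
types = infer_types(triples)
```

Here `ex:book` is never declared to be an ore:Aggregation, yet using it as the object of ore:proxyIn is enough for a reasoner to type it as one, so removing the subclass relation would not keep ordered objects out of ore:Aggregation.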

The other option would be to recreate the proxy pattern in a different namespace ... which seems like a worse solution when there's already a good enough ontology available to reuse.

azaroth42 commented 8 years ago

I think the issue can be closed, wontFix? IOW, we're leaving Object and Collection as subClasses of Aggregation

escowles commented 8 years ago

+1 to closing

azaroth42 commented 8 years ago

Closing without prejudice. Please reopen if you think there's more to say.