duraspace / pcdm

Portland Common Data Model
http://pcdm.org/models
Apache License 2.0

Remove the Work class from the Works extension #62

Open escowles opened 8 years ago

escowles commented 8 years ago

The works extension includes a Work class for representing the top level of an item (e.g., an entire book as opposed to its pages).

Discussion of the Work class focused on problems, such as an item being a Work in its own right, but also part of another Work (e.g., a page of a book, which is a map that might be included in a map collection). The consensus was that because of the different contexts, it didn't make sense to label something as a Work, and it would be better to just make everything an Object and use context to figure out whether something was an independent entity or part of another Object.

Should we remove the Work class from the Works extension and/or PCDM 2.0?

See #53 for preliminary discussion.

acoburn commented 8 years ago

It may be worthwhile to consider this in the context of ore:ResourceMaps, which are used precisely to describe aggregations such as this (i.e. one ReM may describe a particular page of a book, while another ReM describes the graph consisting of the entire book).

tpendragon commented 8 years ago

I'm :+1: for removing the Work class.

@acoburn We talked about this some, but I'm not convinced that's what a ResourceMap is. It "describes aggregations", but by grouping statements that have the subject of the aggregation, and I certainly don't think they describe different graphs. Can you build an example of what you're thinking?

azaroth42 commented 8 years ago

:+1: to remove, and :-1: to ResourceMaps. ResourceMaps are the information resources that describe aggregations -- they're essentially the serialization with its own metadata and cannot exist without an aggregation. They're (mea culpa) a librarian view on the importance of a "record" when we didn't really understand how the global graph was going to pan out.

cmharlow commented 8 years ago

@azaroth42 - that explains a lot. thanks for clarifying!

DiegoPino commented 8 years ago

@azaroth42 now you got me confused. So are you removing ReM from the ORE specs? Without ReM how do you know where one aggregation starts and ends? From a semantic point of view, they still make sense to me, but if they will be removed from ORE it would be good to know.

azaroth42 commented 8 years ago

No, there's no process to update ORE. Not sure what you mean by start and end? In terms of membership, they start with the first (cough, actually any random) aggregated resource and end with the last (cough, any other random) aggregated resource. Resource Maps are just a not-very-concise bounded description or hack at implementing a named graph, circa 2006.

DiegoPino commented 8 years ago

@azaroth42 i was constantly interpreting ReM as named graphs (they have a unique URL and are defined as a subclass of rdfg:Graph). Well, i was mistaken!

tpendragon commented 8 years ago

Thanks for the clarification, @azaroth42

cmharlow commented 8 years ago

+1 removing hw:Work / not including it in PCDM (without much more use casing, discussion). I agree that PCDM can figure out contextually what it needs, when it needs it, about the whole / part information captured now with hw:Work if we just keep pcdm:Object. I have also wondered if hw:Work was more about capturing what is the primary discovery resource in the repository than any idea of Work completeness (just a thought, and probably an incorrect one).

But more for me, I think we aren't using hw:Work with a firm enough definition to make it really meaningful - which, no fault there, pinning down 'completeness' is often hard + arbitrary, especially with the wide variety of materials we see in digital repositories. Doubly especially with hw:Work often sitting at the intersection of digital object structure + intellectual... conceptualizations?.

IMHO, I think implementations can decide on linking or typing pcdm:Object with other ontologies for the contextual descriptive needs in play for the repository or the resource types' domain (domains here == Musical Scores, or Paintings, or Newspaper Articles, etc.). From that, we can approach the other possible uses for capturing Work / Part separation that fall in bibliographic or cultural heritage resource descriptive practices (and avoid getting PCDM via Hydra-Works brought into those discussions, or worse, discussed with a different understanding of Work attached to it).

acoburn commented 8 years ago

w/r/t ResourceMaps, I do not see them as an ideal way to express and describe an aggregation of resources. If I were to invent a vocabulary, I'd see no problem with putting descriptive metadata directly on the aggregation-like resource.

That said, pcdm:Object < ore:Aggregation and pcdm:hasMember < ore:aggregates, and so by RDFS entailment, any pcdm:Object is also an ore:Aggregation. And furthermore, according to the ORE spec, any ore:Aggregation MUST have a corresponding ore:ResourceMap.

And as much as I find ResourceMaps unnecessary, I find that simply ignoring a spec to be even worse and not something I can recommend to my institution. To put this another way, if PCDM is not interested in following the semantics of ORE, why doesn't it just use DC: pcdm:Object < dcmi:Collection and pcdm:hasMember < dcterms:hasPart?
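A minimal sketch of the entailment argument above, in plain Python rather than a real RDF library. The `ex:` resources and the helper names are hypothetical; prefixed names are treated as plain strings, and only the two RDFS rules relevant here (subclass and subproperty) are applied. The final check uses `ore:isDescribedBy`, the ORE property that links an aggregation to its Resource Map.

```python
# Toy triple store: prefixed names are plain strings, not real IRIs (assumption).
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

# The two axioms cited in the comment above.
schema = {
    ("pcdm:Object", SUBCLASS, "ore:Aggregation"),
    ("pcdm:hasMember", SUBPROP, "ore:aggregates"),
}

# Hypothetical instance data: a book with one page.
data = {
    ("ex:book", TYPE, "pcdm:Object"),
    ("ex:book", "pcdm:hasMember", "ex:page1"),
}


def rdfs_closure(triples):
    """Apply the RDFS subclass and subproperty rules until nothing new is inferred."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # (x type C) and (C subClassOf D) => (x type D)
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # (x p y) and (p subPropertyOf q) => (x q y)
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
        if new <= inferred:
            return inferred
        inferred |= new


def aggregations_without_rem(triples):
    """Return every ore:Aggregation that has no ore:isDescribedBy statement."""
    aggregations = {s for s, p, o in triples if p == TYPE and o == "ore:Aggregation"}
    described = {s for s, p, o in triples if p == "ore:isDescribedBy"}
    return aggregations - described


closure = rdfs_closure(schema | data)
missing = aggregations_without_rem(closure)
```

Under these assumptions, `ex:book` is inferred to be an `ore:Aggregation`, and because nothing in the data describes it with a Resource Map, it ends up in `missing`. That is the gap between PCDM practice and the ORE requirement that this comment points at.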

cmharlow commented 8 years ago

Heya @acoburn - I completely feel what you're saying here:

And as much as I find ResourceMaps unnecessary, I find that simply ignoring a spec to be even worse and not something I can recommend to my institution.

Has this been brought up before in PCDM discussions, does anyone know? If so, is this a good time to bring this thread back up / if not, should we start that discussion thread and figure out if 1. we want to start using ResourceMaps or 2. really push ORE to stop using Resource Maps or 3. we stop subclassing ORE?

Also, Aaron - any hesitancy against dropping specifically the Work construct from HydraWorks? I apologize if I'm missing the point with your previous discussions of Resource Maps in the context of this question - I'm still wrapping my head around all this. Thank you for your input!

scossu commented 8 years ago

:+1: to dropping Works.

acoburn commented 8 years ago

@cmh2166 I have no particular opinion on Works: I don't use it and have no plans to use it. What I do need is a mechanism for identifying "Top-Level" resources -- the role that I understand Works to fill. For that, I plan to use ResourceMaps (others may have different approaches here -- OWA). If y'all want to drop Works, it sure sounds like a good idea to me, but again, I have no real opinion on it.

escowles commented 8 years ago

👍 to dropping the Work class.

Though if the relationship to ORE and our lack of ResourceMaps is troubling to people, then we should address it. I am not hearing any support for requiring ResourceMaps, and it sounds like ORE isn't going to be updated to make them optional. So I think documenting our decision not to include them in PCDM would be the best way forward. I think that would look something like:

cmharlow commented 8 years ago

Thanks again, @acoburn for the response and explanation.

What I do need is a mechanism for identifying "Top-Level" resources -- the role that I understand Works to fill.

This helped confirm my understanding of the possible use cases for Work. I agree that I would leave the delineation of 'top level resources' to implementation decisions.

@escowles I agree with the need to address this issue. Do we want to break it out into a different thread, so folks are aware it surfaced (in case they're not watching this thread, as Works is Hydra-specific)? It would give a chance for folks to confirm the points you make (that i agree with), namely:

azaroth42 commented 8 years ago

Europeana and DPLA also do not require resource maps on their sub-classes of ore:Aggregation. We're in good company here.

Also, "top level" is context specific. The "top level" of a photograph in an album in a box in a collection is dependent on what your frame of reference is. If you're searching for photographs, it's the photograph. If you're browsing the things in the collection, it's probably the album... unless you're in an archival system at which point it could be the box.
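To make the frame-of-reference point concrete, here is a small sketch under stated assumptions: the containment chain is the photograph/album/box/collection example from this comment, and the three context names and the `top_level` helper are hypothetical, not anything PCDM or ORE defines.

```python
# Hypothetical containment chain: collection > box > album > photograph.
parent = {
    "photograph": "album",
    "album": "box",
    "box": "collection",
}

# Each context names the kinds of resource it treats as "top level" (assumption).
contexts = {
    "photo_search": {"photograph"},
    "collection_browse": {"album"},
    "archival_system": {"box"},
}


def ancestors(resource):
    """Return the resource followed by each enclosing container, innermost first."""
    chain = [resource]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def top_level(resource, context):
    """Return the nearest enclosing resource that the given context treats as top level."""
    for candidate in ancestors(resource):
        if candidate in contexts[context]:
            return candidate
    return None


results = {name: top_level("photograph", name) for name in contexts}
```

The same photograph resolves to a different "top level" in each context, which is why a single class assigned once to the resource cannot capture it.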

DiegoPino commented 8 years ago

@azaroth42 yes, i would like to know the whereabouts of that, reading now the Europeana OWL ontology at https://github.com/europeana/corelib/blob/v2.3/corelib-edm-definitions/src/main/resources/eu/rdf/edm-v524-130522.owl

but, sorry, I certainly don't like

Also, "top level" is context specific. The "top level" of a photograph in an album in a box in a collection

An ontology that does not define "what a photograph is" can't, should not, should avoid being context specific to a "photograph".

cmharlow commented 8 years ago

Heya Diego-

Maybe referring to this? https://github.com/europeana/corelib/blob/master/corelib-edm-definitions/src/main/resources/eu/rdf/edm.owl#L125 But I’m just guessing, and not sure I’m looking at the right version of the Europeana Data Model Ontology. But playing with their SPARQL Endpoint, you can find a few of those Aggregations and see they don’t have ResourceMaps, per the ORE spec. As far as I’m understanding it, here are some Europeana Aggregation resources, which you can see don’t have ore:describes (or a subclass I’m aware of) linking to ore:ResourceMaps (or a subclass I’m aware of).

Also, do you think PCDM should define photographs? Generic top-level objects, so to speak? I’m a bit unclear there, sorry. I’m happy to share more thoughts here, but I want to make sure I understand your concerns before. It ties back, possibly, to my thoughts that the Work could be a useful concept, but it needs to be discussed, scoped, etc. lots more before then, and I don’t think this particular Work construct from Hydra-Works fits the bill.

Thanks for all your thoughts!

DiegoPino commented 8 years ago

@cmh2166 hi, EDM does not define ReM, but their aggregation does subclass from 2 other classes, which leads to certain flexibility under an open world assumption. Still, i know they have a task group and i would really love to know their reasons behind this and how they justify skipping the specs, because at least i don't know how to tell my algorithms not to take certain ontologies into account! 😄

I certainly feel PCDM should not define photographs, nor books, nor anything in that domain of knowledge ==> real entity to semantic web translation. It should stick to the structural, like Technic Lego pieces. But that is my feeling.

And also, by that reason top level can not be context specific, because the context that was defined here is out of PCDM's scope. The reason to have a top level definition is because when you want to fetch a resource, you need to know where the strong relationship between resources is defined and when you are diverging to a new concept. Each aggregation "bundles" resources that belong together, whatever the higher semantics you want to co-assign (schema:book, etc). Without the top level definition, everything in the same paths (reachable via N hops in that graph) belongs to the same meaning which is not true, i need to have a split graph and be able to say: this all belongs to a page, this all belongs to the book, without having to resort to book or page ontology only (if i need to do that, then having aggregations are not needed anymore, i just use domain specific ontologies)
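The "split graph" concern above can be sketched as a bounded traversal. This is only an illustration under assumptions: the resource names, the `edges` table, and the idea of passing an explicit set of boundary resources are all hypothetical, and nothing here is defined by PCDM or ORE.

```python
from collections import deque

# Hypothetical links: a book with two pages; page2 is also tied to a map collection.
edges = {
    "book": ["page1", "page2"],
    "page1": ["file1"],
    "page2": ["file2", "mapCollection"],
    "mapCollection": ["otherMap"],
}


def reachable(start, boundaries=frozenset()):
    """Breadth-first walk from start that never steps onto a boundary resource."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt in seen or nxt in boundaries:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return seen


# Without any boundary, everything reachable in N hops is lumped together.
unbounded = reachable("book")

# With the map collection marked as a separate top-level resource, the walk stops there.
bounded = reachable("book", boundaries={"mapCollection"})
```

With no boundary the walk from `book` drags in `otherMap`, which belongs to a different concept; with the boundary it returns only the book, its pages, and their files. Whether that boundary comes from a class, a Resource Map, or an implementation decision is exactly what is under discussion here.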

But, as always, it's subject to discussion of course.

Thanks

cmharlow commented 8 years ago

So EDM does what PCDM is doing, per @azaroth42's original point. And they have the same question to answer that @DiegoPino + @acoburn brought up.

Okay, good we agree on structural definitions for PCDM, i.e. the repository objects. I wasn't sure from your previous comments, but thought this was the case.

And the last bit hits on the point we reached before, I think, which is hydra-works:Work maybe doesn't fit the bill of what is needed here, but possibly something else is needed in PCDM core to approach this need, i.e.:

i need to have a split graph and be able to say: this all belongs to a page, this all belongs to the book, without having to resort to book or page ontology only (if i need to do that, then having aggregations are not needed anymore, i just use domain specific ontologies)

So the idea before was that you would use your own implementation decisions to determine this, but you (sorry, @DiegoPino here) think it should be in PCDM. Do you have a recommendation for what this would look like? Do you think it would be Work?