Islandora / documentation

Contains islandora's documentation and main issue queue.

Content model objects in Islandora/F4 #31

Closed mjordan closed 8 years ago

mjordan commented 9 years ago

This is a followup to a discussion started during the March 27 meeting of the Fedora 4 Interest Group.

During the call, we discussed the impact of not using content model objects (as Islandora currently uses) but instead expressing collections as RDF properties on member objects. Some of the implications we identified are:

Topic for discussion: is it preferable to model content models as full Fedora 4 objects, or to express content models as simple RDF string properties? Examples of implementations to illustrate your arguments would be useful.

DiegoPino commented 9 years ago

Sorry, I could not attend that meeting, but I'm not sure that using arbitrary values in, e.g., rdf:type to denote membership in a cmodel would be the best idea, in terms of how we traverse our relations tree (graph) in the future and how we make our objects and relations discoverable for linked-data-compliant applications. Some quick questions first:

"but instead expressing collections as RDF properties on member objects." Do you mean expressing cmodel membership/classifications as RDF properties? I see rdf:type used for this in Nick's diagram.

Looking at PCDM, there are some basic definitions we could use and extend.

In both cases, rdf:type still applies, but to a real, ontology-defined class. I thought that was the idea behind PCDM: extend it to suit our needs. I think CMODELs play an important role for developers (users don't even know what's behind them) and also for making objects portable. If they are not present, everything is tied to our frontend (with its benefits and drawbacks).

mjordan commented 9 years ago

@DiegoPino, in response to your question

"but instead expressing collections as RDF properties on member objects." Do you mean expressing cmodel membership/classifications as RDF properties? I see rdf:type used for this in Nick's diagram.

Yes, content models, not collections, sorry about that. I've edited the original post to reduce further confusion.

If I recall the discussion at the meeting, the proposal was to not represent content models as Fedora objects, but to define a content model as a simple string literal property of an Islandora object. If a content model were identified only by a literal string, it could not have properties of its own. @ruebot's example diagram uses the 'hasParent' predicate, but now I see at https://github.com/Islandora-Labs/islandora/blob/7.x-2.x/docs/technical-documentation/migration.md that the proposed replacement is 'rdf:type'. Perhaps he can elaborate on that choice.
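To make the difference concrete, here is a rough sketch, not anything from the meeting. It models triples as plain Python tuples rather than using an RDF library, and every name in it (`obj:1`, `isla:contentModel`, `isla:LargeImage`, `isla:Image`) is a hypothetical placeholder. In RDF a literal cannot be the subject of a triple, so nothing further can be said about a content model that is only a string.

```python
# Illustrative only: triples as (subject, predicate, object) tuples.
# All identifiers below are hypothetical placeholders.

# Option A: the content model is a string literal on the member object.
literal_model = [
    ("obj:1", "isla:contentModel", '"large_image"'),
]

# Option B: the content model is a resource that can carry its own properties.
resource_model = [
    ("obj:1", "rdf:type", "isla:LargeImage"),
    ("isla:LargeImage", "rdfs:subClassOf", "isla:Image"),
    ("isla:LargeImage", "rdfs:label", '"Large Image"'),
]


def properties_of(graph, node):
    """Return the (predicate, object) pairs whose subject is `node`."""
    return [(p, o) for s, p, o in graph if s == node]
```

With option A, `properties_of(literal_model, '"large_image"')` is empty by construction; with option B, the same question about `isla:LargeImage` returns its parent class and label.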

You make a strong argument for retaining content models as objects, especially as sub-classable objects. But subclassing content models raises the question of how to model them as objects in the first place, e.g., do we have an implementation mechanism for extending or refining parent model classes (as in your image/large image example)?

ruebot commented 9 years ago

tags @rosiel since she had some good thoughts and opinions on the call too.

islandora:root -- It looks like we should preserve this as Danny, Jared, and I continue working through the migration-utils. The rationale is the fcrepo4 tree structure: it is more efficient as a deep tree than as a wide tree. So, we're thinking of proposing an "islandora:root" for each Islandora (thinking of the multi-site use case).

As for content models, "proposal was to not represent content models as Fedora objects, but to define a content model as a simple string literal property of an Islandora object": that was just my initial inclination. But after our discussion, I'm not sure this is the best way forward, since we need to sort out the confusion around content models. That said, would one of y'all be willing to lay all this out in a use case? Then I'd be more than happy to work on updating the basic model I have started.

mjordan commented 9 years ago

@ruebot Does retaining islandora:root allow for content in the repo that is not a descendant of that collection (i.e., not a member of any Islandora collection)? That was a use case I brought up during the call.

DiegoPino commented 9 years ago

I would love to give this a try next week, but I still have this question about where/when we are enforcing/checking/applying our ontology (the base one or an extended one). Maybe inside the triple store? This is a main topic in my opinion because, as I wrote, it's possible to define every CMODEL as just a subclass definition (with its own particularities) inside an ontology (described in OWL). This way an Image object could be an instance of a specific subclass of pcdm:Object, e.g., isla:ImageObject. But if we are leaving the ontology just as a reference, then we can't enforce these properties and must hardcode the structure/properties/restrictions of an isla:ImageObject on whichever side we choose, or do as we do now and create an object of type cmodel that glues everything together.

Basically my idea is to extend the PCDM ontology into something more specific. Every "solution pack" could then add a chunk of ontology to this base Islandora-extended one, describing what this new subclass of objects should look like.
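A minimal sketch of that idea, under stated assumptions: triples are plain Python tuples instead of a real triple store, `isla:LargeImageObject` and `obj:42` are made-up names, and the "solution pack" is just a set of extra subclass statements merged into the base ontology. It only shows how a type check could walk `rdfs:subClassOf`; it says nothing about where the check should run.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Base Islandora extension of PCDM (isla:ImageObject is the example above).
base_ontology = {("isla:ImageObject", SUBCLASS_OF, "pcdm:Object")}

# Hypothetical chunk of ontology contributed by one solution pack.
large_image_pack = {("isla:LargeImageObject", SUBCLASS_OF, "isla:ImageObject")}


def superclasses(ontology, cls):
    """Return `cls` plus every ancestor reachable via rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(
            o for s, p, o in ontology if s == current and p == SUBCLASS_OF
        )
    return seen


def is_instance_of(data, ontology, resource, cls):
    """True if any declared rdf:type of `resource` is `cls` or a subclass of it."""
    declared = [o for s, p, o in data if s == resource and p == RDF_TYPE]
    return any(cls in superclasses(ontology, t) for t in declared)


ontology = base_ontology | large_image_pack
data = {("obj:42", RDF_TYPE, "isla:LargeImageObject")}
```

Here `is_instance_of(data, ontology, "obj:42", "pcdm:Object")` holds only once the pack's chunk has been merged in, which is the sense in which a solution pack "adds" a subclass.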

About islandora:root: I agree. Fedora 4 allows multiple types of relations, so it has its base "tree" but also the multidimensional/flexible graph formed by relations between objects (hope so, or I'm doomed!)

ruebot commented 9 years ago

@mjordan I'm not sure what you mean. Do you mean in the Islandora context? Or, just throwing whatever you want to throw in fcrepo4?

mjordan commented 9 years ago

@ruebot Throwing whatever I want into the same instance of fcrepo as the one that powers my Islandora sites.

ruebot commented 9 years ago

@mjordan I don't see why not. That is at the fcrepo4 level, not at the Islandora level. You can do that now with fcrepo3 and Islandora.

mjordan commented 9 years ago

@ruebot thought so but we did discuss this during the call, just confirming.

DiegoPino commented 9 years ago

@mjordan, what @ruebot says is true. Even when islandora:root is in place, how your objects interrelate outside this functional definition is up to you.

mjordan commented 9 years ago

@DiegoPino WRT your main topic, that's what I was wondering about when I said "implementation mechanism for extending or refining..." Inheritance, subtyping, and object chaining would be very useful. Just thinking out loud, but since Islandora is implemented in PHP (because of Drupal), if we defined Islandora objects as PHP classes, could we map PHP's OO implementation to Islandora objects' implementation? Content-model-oriented solution packs would then not use XML to define content models; they'd use PHP classes. Kinda getting off topic here...

DiegoPino commented 9 years ago

@mjordan Not sure about how to implement/describe this in code, but it sounds nice! The logic behind ontologies (OWL/RDF) is 100% class oriented, pure sweet object-class theory, but ontologies describe some hard-to-process definitions (they need a rules system plus a reasoner implementation), like restrictions, domains, etc. But I'm still thinking about this implementation: describing CMODELs as OWL/RDF and leaving the logic of interpreting this definition to PHP (or camel? ja!). Think of a CMODEL as a traversable OWL/RDF structure, a graph with some conditions that can be fetched and applied when building a new object or an interface in Drupal. I'm already doing this on Fedora 3 for some pretty complex stuff, but it's a lot of code that I'm sure nobody (except me) will be happy to maintain. But speaking for this model/approach: our structure wouldn't be obscure anymore for other systems, and subclassing is simple: just adding new OWL files somewhere (still thinking where). So we could define (just) a way of understanding OWL in PHP (object oriented) and then build our local magic using Drupal.

rosiel commented 9 years ago

@DiegoPino That's exactly what I have in mind, though I haven't implemented it yet. I love the idea of OWL ontologies for relationships between CMODELs, because they're so powerful and can define possible relationships (with those domains/ranges you mentioned). However, in my use case the ontology I'm interested in is a generic ontology (i.e., it defines concepts like "Image" and "Performance" and "Expression"), so it doesn't fully describe the kinds of data objects that I'll be using. So I'll need custom classes/cmodels that inherit from these generic classes (maybe defined in a custom OWL ontology, put... somewhere?). I want my "content classes" to have their own metadata requirements/interfaces and display code. Is that still the plan for F4?

DiegoPino commented 9 years ago

@rosiel, your needs are in my opinion a perfect use case for ontologies overlapping and/or subclassing. The big question is how we extend our base to have a base definition for our data, and where/how we allow such alternative ways of visualising/relating the same data to happen. If we use PCDM as the base, as we are planning to do, with every object being an instance of a subclass of ore:Aggregation, then this is a must-read: http://www.openarchives.org/ore/1.0/datamodel#ReM-to-aggr.

I'm still trying to fully understand how the "resource map" comes into play in F4: is it hardcoded or just a theoretical concept? (help!). I can also fully imagine a case where we allow multiple ontologies to exist. So, on the official Islandora side, we start by extending our official PCDM by deriving classes and adding simple ontologies that extend the CMODEL concept, but we also allow objects to be "classified" inside a different semantic world with a whole different definition that at some point (like a directed property) converges to the base one. The "put… somewhere" question is my concern also, as is "how we process/query these". Your use case could also be modelled using PCDM, derived from the all-permitting and flexible pcdm:Object definition by describing a very generic cmodel and then subclassing again? Having RDF and OWL statements gives us that flexibility. Mmm, maybe we should start drawing some diagrams...

whikloj commented 9 years ago

I may have missed the point (it happens sometimes) but my understanding was that we would not be getting rid of the core content models, but perhaps we wouldn't have to install them into every single Islandora repository.

This would require some different thinking to maintain these objects in a central place, but for those of us that don't alter the core content models it makes perfect sense (especially if they are not referenced continuously). It would also not stop anyone from extending an existing Islandora content model for their own needs.

DiegoPino commented 9 years ago

I'm a bit confused and still thinking a lot about what would be the best way to implement this, in terms of

The problem (in my humble opinion) with having the CMODEL definitions in some central place, @whikloj, is that somehow we will have to ask for a definition whenever someone needs to ingest an object, if we are enforcing that definition of course. We could also preprocess that info once and store it in some kind of Drupal structure (cache, db, whatever), but still, we would need to have "someone" serve the data for us at least once. That may mean many requests, which means resources, uptime, etc. = money. What I would like to investigate is how we can store that info (in the triplestore) once in every repo on "solution pack install", if we still have that. OWL files are RDF and can be queried via SPARQL or directly using a nice PHP API like Graphite. Protégé has a SPARQL-for-OWL interface that can be used to play a bit and make some tests. This could allow an intermediate solution: no need for "objects" that define CMODELs, but the definitions are still locally available. How is Hydra managing this?

Back to basics: I think we can discuss this and other options at the next meeting, even the possibility of going back to a simple enumerated data property.

daniel-dgi commented 9 years ago

We can always have centrally hosted versions but cache them locally and have cron check for changes once a day. That way your app doesn't break if the central server goes down. We should do the same for LOC stuff. Remember when the US Gov't shut down?
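Roughly what I mean, as a sketch only: the class name `CachedOntology`, the injected `fetch` callable, and the one-day `max_age` are all assumptions for illustration, not existing Islandora code. The point is that a failed refresh keeps the last good copy instead of breaking the app.

```python
import time

ONE_DAY = 24 * 60 * 60  # seconds


class CachedOntology:
    """Local copy of a centrally hosted definition, refreshed at most daily."""

    def __init__(self, fetch, max_age=ONE_DAY, clock=time.time):
        self._fetch = fetch      # hypothetical callable that contacts the central host
        self._max_age = max_age
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def refresh(self):
        """Try to update the local copy; on failure keep whatever we already have."""
        try:
            self._value = self._fetch()
            self._fetched_at = self._clock()
            return True
        except Exception:
            return False

    def get(self):
        """Return the local copy, attempting a refresh first if it is stale."""
        stale = (
            self._fetched_at is None
            or self._clock() - self._fetched_at >= self._max_age
        )
        if stale:
            self.refresh()
        if self._value is None:
            raise LookupError("no local copy and the central server is unreachable")
        return self._value
```

A cron job would call `refresh()` once a day; `get()` only raises when there has never been a successful fetch.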

ruebot commented 8 years ago

See also #179

dannylamb commented 8 years ago

Closing old issues. Will be bringing up OWL ontologies and our usage of them in a newer ticket with an MVP context.