buda-base / lds-pdi

http://purl.bdrc.io BDRC Linked Data Server
Apache License 2.0

Ontology service limitation #183

Closed MarcAgate closed 4 years ago

MarcAgate commented 4 years ago

We have two different ontology services: A) HTML-based (serving HTML through JSPs) and B) file-based (serving raw files in various formats).

As a reminder, a first limitation is that service A can only be run on the "real production" server corresponding to the "real" ontologies and resource names (in our case "purl.bdrc.io")

In case A, we can serve both base URIs (i.e. the entire ontology without imports, as http://purl.bdrc.io/ontology/admin/ ) and ontology resources such as http://purl.bdrc.io/ontology/admin/Product

In case B, we can serve base URIs with extensions (i.e. the entire ontology without imports, as http://purl.bdrc.io/ontology/admin.ttl or http://purl.bdrc.io/ontology/admin.json ).

However, we can't, at this time, serve ontology resources as ttl (for instance http://purl.bdrc.io/ontology/admin/AdminData.ttl) the same way we serve "normal" bdr resources.

This hasn't been an issue so far, since we don't actually use ontology resource definitions in our various applications, but the feature has become mandatory now that we have to deal with nested SHACL shapes served as ontologies (for instance, we might want to get http://purl.bdrc.io/ontology/shapes/core/PersonShape.ttl).

Which would give us:


    <http://purl.bdrc.io/ontology/shapes/core/PersonShape>
            a               sh:NodeShape ;
            rdfs:label      "Person Shape"@en ;
            adm:graphId     bdg:PersonShapes ;
            sh:property     <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasBrother> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasSon> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasMother> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasCousin> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-personStudentOf> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-gender> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-kinWith> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasGrandfather> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasGrandmother> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasGranddaughter> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasParent> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasGrandParent> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-personTeacherOf> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasOlderBrother> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasGrandson> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasSpouse> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasFather> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasSister> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasOlderSister> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasGrandChild> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasYoungerSister> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasYoungerBrother> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasDaughter> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-hasSibling> ,
                            <http://purl.bdrc.io/ontology/shapes/core/PersonShape-personName> ;
            sh:targetClass  bdo:Person .

This limitation must be removed.

xristy commented 4 years ago

> In case A, we can serve both baseUris (i.e entire ontology without imports as http://purl.bdrc.io/ontology/admin/ ) or Ontology resources as http://purl.bdrc.io/ontology/admin/Product

I don't understand "entire ontology without imports" vs an Ontology resource such as http://purl.bdrc.io/ontology/admin/Product - is this latter w/ imports?

xristy commented 4 years ago

@MarcAgate Sorry to be slow, but why?

> we can't, at this time, serve ontology resources as ttl (for instance http://purl.bdrc.io/ontology/admin/AdminData.ttl) the same way we serve "normal" bdr resources.

or in what way are they served differently?

I think all of these variations can be handled with the OntDocumentManager and OntManagerSpec, etc.

I think the following provides all of the variations of models needed w/o additional coding:

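    // The policy file maps ontology URIs to the documents to load and cache.
    String ONT_POLICY = "https://raw.githubusercontent.com/buda-base/ontology/abstractworks/ont-policy.rdf";
    OntModelSpec oms = new OntModelSpec(OntModelSpec.OWL_MEM);
    OntDocumentManager odm = new OntDocumentManager(ONT_POLICY);
    oms.setDocumentManager(odm);
    // Follow owl:imports when an ontology is loaded.
    odm.setProcessImports(true);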

then

    String ontUri = "http://purl.bdrc.io/ontology/core/";
    OntModel om = odm.getOntology(ontUri, oms);

This creates an OntModel from which om.getBaseModel() returns the base model w/o any of the imported models. These models were already cached by new OntDocumentManager(ONT_POLICY), so:

    ontUri = "http://purl.bdrc.io/ontology/types/Binding/";
    OntModel om2 = odm.getOntology(ontUri, oms);

and om2.getBaseModel() will return just the base model w/o any imports, if there are any.

To get a Model (rather than an OntModel, if needed) w/ all the triples from the base and the imports:

    Model m = ModelFactory.createModelForGraph(om.getGraph());

or

    Model m = om.getRawModel();

if no inferencing has been done which is the case with:

    oms = new OntModelSpec(OntModelSpec.OWL_MEM);

or using the OntModelSpec.OWL_DL_MEM which also does no inferencing.

MarcAgate commented 4 years ago

First point:

http://purl.bdrc.io/ontology/admin/Product is actually a class of the admin ontology (imports are irrelevant at the class level).

This is http://purl.bdrc.io/ontology/admin/Product (not an ontology, just an RDF resource extracted from the admin ontology model):

    adm:Product
      a owl:Class ;
      rdfs:comment """A collection of Works that are distributed together. Historically this has meant on a CDROM, DVDROM or hard drive. Nowadays, Products are used mostly to collect Works that are made available to institutional subscribers online via IP-address validation. Although there are still hard drive collections prepared from Products.
    IP-address validation is supported by adding to a Product sets of address ranges organized by organizations, like UCB, TUMS etc. When the migration ontology is normalized these sets of address ranges should be given their own resource outside of the Product resource since the model of institutional subscriptions is being changed so that it is not Product centric.
    Products have decs:contents rather than a :name (but this can be added during normalization).
    The various Works that are contained in the Product include an inProduct element that refers back to the Product. So to find all the Works that are contained in a Product is a simple query. to locate all Works that refer to a given Product via wrk:inProduct.""" ;
      rdfs:comment "Names are represented by rdfs:label"@en ;
      rdfs:label "Product"@en ;
      rdfs:subClassOf adm:Entity .

This doesn't come from the OntDocumentManager as a single ttl Model (the OntDocumentManager provides models of whole ontologies, and not models of the resources or classes that lie within an ontology).

Therefore, alongside it, there is a mechanism in ldspdi that collects all the info you see at http://purl.bdrc.io/ontology/admin/Product (i.e. individuals, root class and inherited props, properties with range, etc.).

Second point:

As I just said, OntDocumentManager, imports and so on are not the issue here. Everything related to loading and caching separate ontologies as ttl was addressed quite some time ago for our core, auth, admin and (recently) shapes ontologies and sub-ontologies.

The issue is about serving Node resources belonging to a given ontology as ttl. So serving http://purl.bdrc.io/shapes/core/PersonShapes/ is not an issue ( http://ldspdi-dev.bdrc.io/shapes/core/PersonShapes.ttl ).

Extracting the model for the resource http://purl.bdrc.io/ontology/shapes/core/PersonShape is what I am talking about. There is no ttl file associated with it that can be loaded. It has to be extracted from the ontology model of the ontology it belongs to. So far we have done that, as described above, for the html service and for the core ontologies (adm, core, auth), but not as a model serialized in ttl. We need that so we can use SHACL shapes (defined as or within ontologies) within the editor client.
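The extraction step described above amounts to keeping only the statements whose subject is the requested resource. The sketch below illustrates that filtering with plain Java; the `Triple` class and `describe` method are hypothetical stand-ins rather than Jena or ldspdi code, and a real implementation would operate on a Jena Model and serialize the result as ttl.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: illustrates subject-based extraction without Jena. */
final class ResourceExtractor {

    /** Plain-Java stand-in for an RDF statement (not a Jena class). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Returns only the statements whose subject is the requested resource URI. */
    static List<Triple> describe(List<Triple> ontology, String resourceUri) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : ontology) {
            if (t.subject.equals(resourceUri)) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String personShape = "http://purl.bdrc.io/ontology/shapes/core/PersonShape";
        // Sample statements taken from the snippets quoted in this thread.
        List<Triple> ontology = Arrays.asList(
            new Triple(personShape, "rdf:type", "sh:NodeShape"),
            new Triple(personShape, "sh:targetClass", "bdo:Person"),
            new Triple("http://purl.bdrc.io/ontology/admin/Product", "rdf:type", "owl:Class"));
        for (Triple t : describe(ontology, personShape)) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```

Note that this keeps only statements with the resource as subject; whether nested property shapes should be pulled in as well is a design decision this sketch leaves open.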

For instance, when getting the best shape for a Person, we might need a single shape, a sh:NodeShape (for instance http://purl.bdrc.io/ontology/shapes/core/PersonShape), instead of the whole ontology. However, we have to be able to serve ontologies as a whole as well as individual shapes. That will be the case for all ontologies (core or shapes) once this issue is resolved. I am on it.

xristy commented 4 years ago

Sorry to be so dense. I was misunderstanding "entire ontology without imports" vs ontology resource.

TopQ_GetShape loads a file like PersonShapes_BASE.ttl (which is a copy of editor-templates/templates/core/person.shapes.ttl) and then extracts bds:PersonShape from the shapesModel and writes it as TTL which is what I understand you are wanting.

MarcAgate commented 4 years ago

I think that we should avoid using Jena Ontology terminology when talking about shacl shapes as defined and loaded in editor/templates. This is misleading since we load ontologies as OntModel when we actually just load a Model since these shapes "ontologies", loaded this way, are not made of ontology classes. Therefore, the whole Jena OntologyModel framework is not applicable and is useless.

Moreover, I realize that it's been so misleading that we undertook to serve these shapes as ontologies, using the same framework as the one we developed for core, auth and admin, which are real ontologies, with classes, ont props, domain, ranges, individuals and so on. It doesn't make sense at all.

Shapes should be served as resources and have a dedicated endpoint in the same way we have purl.bdrc.io/resource/XXXX for bdr:resources. There should be a way to organize a set of named graphs for all shapes using the current modular loading system, based on Jena OntDocument Manager.

Edit: The whole shape service should therefore be implemented in editserv, apart from the ldspdi Ontology service, for the sake of clarity and simplicity. In practice, most if not all of the concepts that drive the Ontology service implementation are irrelevant for a SHACL shape service.

MarcAgate commented 4 years ago

We have a first implementation working.

http://ldspdi-dev.bdrc.io/ontology/shapes/core/InstanceShape.ttl

If you all agree that it makes much more sense to load that apart from our Ontology service and to serve SHACL shapes using a dedicated service on editserv, then I'll move the loading code and a few other things so we'll have clearer code and simpler behaviors on both sides (ldspdi and editserv).

xristy commented 4 years ago

1) did the TopQ_GetShape solve the issue of serving the resource via TTL?

2) I completely disagree with

> This is misleading since we load ontologies as OntModel when we actually just load a Model since these shapes "ontologies", loaded this way, are not made of ontology classes. Therefore, the whole Jena OntologyModel framework is not applicable and is useless.

The use of editor-templates/ont-policy.rdf to manage shapes modules is completely appropriate. Ontology languages like OWL and SHACL are alternative ways of expressing bits of information about what constraints, restrictions and so on apply to Resources.

How you choose to serve whatever, whether in ldspdi or editserv is a separate issue from the use of tools such as ont-policy.rdf, OntDocumentManager and so on.

I've organized the shapes information (which we are using together with a limited set of features from OWL) in a modular manner using ontology framework principles.

I don't think the approach to representing shapes information impacts the rest of your considerations.

xristy commented 4 years ago

I could say OWL and SHACL are also complementary since OWL expresses notions under the Open World Assumption and SHACL is intended to make Closed World statements.

MarcAgate commented 4 years ago

This is the second time you misread me: I have never said we should get rid of the OntDocumentManager modular loading system. I believe I have said the opposite:

> There should be a way to organize a set of named graphs for all shapes using the current modular loading system, based on Jena OntDocument Manager.

I just wrote, however, that SHACL shapes are not and don't need to be ontologies (because they are Closed World documents, precisely) even though they are loaded as OntOntology models, and that the Jena OntModel class is completely useless for handling them. Look: you went back yourself to Model.listStatements() to get something out of these "ontologies". I encountered the same issue before and solved it like this: https://github.com/buda-base/lds-pdi/blob/newbiblio/src/main/java/io/bdrc/ldspdi/rest/resources/PublicDataResource.java#L616

This is what led me to think that we have to separate these two services even though they are loaded using the same OntDocumentManager based mechanism.

xristy commented 4 years ago

> I just wrote however that shacl shapes are not and don't need to be ontologies (because they are Closed World documents, precisely) even though they are loaded as OntOntology models and that the Jena OntModel class is completely useless for handling them.

I simply disagree w/ your assertions probably because I'm misreading for a 3rd time.

MarcAgate commented 4 years ago

As of commit e4bcd72, we are serving both shapes named graphs (as defined in the OntologySpec resources of the ont-policy.rdf file of the editor-templates repo), for instance: http://ldspdi-dev.bdrc.io/shapes/core/PersonUIShapes

and

shapes resources (as defined inside shape named graphs or shapes.ttl files), for instance: http://ldspdi-dev.bdrc.io/ontology/shapes/core/LccnShape

This service has been isolated from the normal OntologyService, which actually manages the ont class hierarchy, domains, ranges, all kinds of Ontology properties (Annotation, Transitive, Object, Data, etc.) and other specifics that are not relevant for SHACL shapes as we use them. However, for (obvious) url pattern reasons (all shape URIs are rooted in purl.bdrc.io), the service is implemented on ldspdi.
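As a rough illustration of how the two shape URL patterns above can coexist with the ontology service on one host, here is a minimal path classifier. The class name, enum and prefix rules are assumptions inferred from the example URLs in this thread, not the actual ldspdi routing code.

```java
/** Hypothetical request classifier; names and rules are illustrative only. */
final class ShapeRouter {

    enum Kind { SHAPE_GRAPH, SHAPE_RESOURCE, ONTOLOGY, OTHER }

    /** Classifies a request path by its prefix. */
    static Kind classify(String path) {
        // Order matters: "/ontology/shapes/" must be tested before the broader "/ontology/".
        if (path.startsWith("/ontology/shapes/")) {
            return Kind.SHAPE_RESOURCE;
        }
        if (path.startsWith("/shapes/")) {
            return Kind.SHAPE_GRAPH;
        }
        if (path.startsWith("/ontology/")) {
            return Kind.ONTOLOGY;
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify("/shapes/core/PersonUIShapes"));
        System.out.println(classify("/ontology/shapes/core/LccnShape"));
        System.out.println(classify("/ontology/core/SerialWork"));
    }
}
```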

The next step is to extend this resource service to ontology resources (for instance serving http://purl.bdrc.io/ontology/core/SerialWork.ext , where ext is a supported Jena extension). This should be straightforward now.
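Serving a resource under an extension implies splitting the requested URL into the resource URI and the serialization format. A minimal sketch of that split follows; the class name and the extension table are assumptions (the media types are the standard registered ones, but which extensions ldspdi actually accepts is not stated here).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser for URLs of the form <resource URI>.<ext>. */
final class OntologyResourceRequest {

    // Assumed extension table; the real list of supported extensions may differ.
    private static final Map<String, String> MEDIA_TYPES = new HashMap<>();
    static {
        MEDIA_TYPES.put("ttl", "text/turtle");
        MEDIA_TYPES.put("nt", "application/n-triples");
        MEDIA_TYPES.put("rdf", "application/rdf+xml");
        MEDIA_TYPES.put("jsonld", "application/ld+json");
    }

    final String resourceUri;
    final String mediaType;

    private OntologyResourceRequest(String resourceUri, String mediaType) {
        this.resourceUri = resourceUri;
        this.mediaType = mediaType;
    }

    /** Returns the parsed request, or null if the last path segment has no known extension. */
    static OntologyResourceRequest parse(String url) {
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        // The dot must sit inside the last segment, after at least one name character,
        // and must be followed by at least one extension character.
        if (dot <= slash + 1 || dot == url.length() - 1) {
            return null;
        }
        String ext = url.substring(dot + 1).toLowerCase(Locale.ROOT);
        String mediaType = MEDIA_TYPES.get(ext);
        if (mediaType == null) {
            return null;
        }
        return new OntologyResourceRequest(url.substring(0, dot), mediaType);
    }

    public static void main(String[] args) {
        OntologyResourceRequest r = parse("http://purl.bdrc.io/ontology/core/SerialWork.ttl");
        System.out.println(r.resourceUri + " as " + r.mediaType);
    }
}
```

A URL without an extension, such as http://purl.bdrc.io/ontology/admin/Product, yields null here, so the caller can fall back to the html service.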

MarcAgate commented 4 years ago

As of commit d401473, we are serving any resource from any ontology as a serialized RDF resource.

for instance (auth): http://ldspdi-dev.bdrc.io/ontology/ext/auth/Permission.ttl

for instance (admin): http://ldspdi-dev.bdrc.io/ontology/admin/place_TLM_admin.ttl

for instance (core): http://ldspdi-dev.bdrc.io/ontology/core/personIsNonActor.ttl