buda-base / lds-pdi

http://purl.bdrc.io BDRC Linked Data Server
Apache License 2.0

problems in ontology serialization #197

Open eroux opened 3 years ago

eroux commented 3 years ago
curl -H "accept: application/n-triples" http://purl.bdrc.io/ontology/core

returns nothing while it should return the n3 syntax. Also

curl -H "accept: text/turtle" http://purl.bdrc.io/ontology/core

returns a result with a different number of lines every time, which probably indicates that some triples are missing... but without the ability to fetch the n3 syntax I can't check easily.

eroux commented 3 years ago

see https://github.com/buda-base/owl-schema/issues/172

xristy commented 3 years ago

Out of curiosity I tried the curl repeatedly:

curl -H "accept: text/turtle" http://purl.bdrc.io/ontology/core | wc -l

and the first one reported 3694 lines. Each successive curl seemed to add around 51 lines each time regardless of the amount of time in between curls.
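A minimal sketch of how the repeated measurement above could be scripted, assuming Python is available. The helper names `fetch_turtle` and `line_count_deltas` are hypothetical and not part of lds-pdi; the URL and Accept header are the ones from the curl call.

```python
from typing import Callable, List
import urllib.request

ONTOLOGY_URL = "http://purl.bdrc.io/ontology/core"


def fetch_turtle(url: str = ONTOLOGY_URL) -> str:
    """Fetch the ontology with the same Accept header as the curl call."""
    req = urllib.request.Request(url, headers={"Accept": "text/turtle"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def line_count_deltas(fetch: Callable[[], str], calls: int) -> List[int]:
    """Return the change in line count between successive fetches.

    splitlines() is used as a stand-in for `wc -l`; the two can differ
    by one when the body lacks a trailing newline.
    """
    counts = [len(fetch().splitlines()) for _ in range(calls)]
    return [later - earlier for earlier, later in zip(counts, counts[1:])]
```

Calling `line_count_deltas(fetch_turtle, 5)` against an affected server would be expected to show a roughly constant positive delta, and all zeros once the serialization is stable.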

xristy commented 3 years ago

Every call to curl is adding lines like:

         [ a             owl:Class ;
           owl:unionOf   ( iiif2:Manifest iiif3:Manifest )
         ] ;

and

         [ a             owl:Class ;
           owl:unionOf   ( bdo:Person bdo:Topic )
         ] ;

for each of 17 definitions like:

bdo:Agents  a    owl:Class ;
   rdfs:label    "Agents"@en ;
   rdfs:subClassOf  bdo:MixedUnion ;
   rdfs:subClassOf  [ a             owl:Class ;
                      owl:unionOf   ( bdo:Agent foaf:Agent )
                    ] ,

3 x 17 == 51 which accounts for that behavior.

Where are these repeated subClassOf triples coming from? They're evidently building up in some cache in lds-pdi.

eroux commented 3 years ago

how interesting...

xristy commented 3 years ago

One thing I notice is a class hierarchy bdo:Union and various subClasses such as bdo:CoreUnion, bdo:MixedUnion. I doubt these are useful.

The subClasses of items such as bdo:CoreUnion have anonymous classes such as:

   rdfs:subClassOf  [ a             owl:Class ;
                      owl:unionOf   ( bdo:Agent foaf:Agent )
                    ] ;

And our stable turtle doesn't work well with blank nodes as I recall, so I wonder whether that may have something to do with this. It would be a recent change in STTL I think. I.e., when fetching from fuseki or some other action the triples w/ blank nodes don't match anything in some model and get added as new triples.

I'll see if I can find a way to define these unions w/o using blank nodes. That might "patch" things for now.
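To illustrate the hypothesis, here is a toy simulation in plain Python (no RDF library) of why re-adding the same document to a model can grow it when blank nodes are involved. It relies on the general RDF rule that blank node labels are scoped to a document, so every parse yields fresh nodes. The `parse` function and the pre-parsed `DOCUMENT` are illustrative assumptions, not lds-pdi or STTL code.

```python
import itertools
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
_fresh = itertools.count()

# Hypothetical pre-parsed form of one union definition; "_:" marks a blank node.
DOCUMENT: List[Triple] = [
    ("bdo:Agents", "rdf:type", "owl:Class"),
    ("bdo:Agents", "rdfs:subClassOf", "_:u"),
    ("_:u", "rdf:type", "owl:Class"),
    ("_:u", "owl:unionOf", "_:l1"),
    ("_:l1", "rdf:first", "bdo:Agent"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", "foaf:Agent"),
    ("_:l2", "rdf:rest", "rdf:nil"),
]


def parse(document: List[Triple]) -> Set[Triple]:
    """Simulate one parse: each run maps blank node labels to fresh nodes."""
    run = next(_fresh)

    def rename(term: str) -> str:
        return f"{term}#{run}" if term.startswith("_:") else term

    return {(rename(s), p, rename(o)) for s, p, o in document}


# Merge the same document into one shared model three times.
model: Set[Triple] = set()
sizes = []
for _ in range(3):
    model |= parse(DOCUMENT)
    sizes.append(len(model))

print(sizes)  # [8, 15, 22]
```

Only the triple made entirely of IRIs is deduplicated; the seven triples that touch a blank node are added again on every merge. The cells of an RDF collection (`rdf:first` / `rdf:rest`) are blank nodes too, so naming the union class alone would not stop a list from being duplicated under this mechanism.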

eroux commented 3 years ago

I don't remember STTL having problems with blank nodes... if so we should fix it. It's a bit hard to imagine what generates these additional triples... but yes, let's investigate, it could be anywhere

xristy commented 3 years ago

I didn't know that STTL worked with blank nodes. In any event I've skolemized the owl:unionOf cases. See owl-schema edcf9c.

xristy commented 3 years ago

With the reworked uses of owl:unionOf, successive curls now add owl:unionOf triples:

bdo:Agents  a    owl:Class ;
   rdfs:label    "Agents"@en ;
   owl:unionOf   ( bdo:Agent foaf:Agent ) ,
         ( bdo:Agent foaf:Agent ) ;
   adm:translationPriority  2 ;
   adm:userTooltip  "The various sorts of agents represented in the ontology, such as Persons, Organizations and so on"@en .

versus the next use of curl:

bdo:Agents  a    owl:Class ;
   rdfs:label    "Agents"@en ;
   owl:unionOf   ( bdo:Agent foaf:Agent ) ,
         ( bdo:Agent foaf:Agent ) ,
         ( bdo:Agent foaf:Agent ) ;
   adm:translationPriority  2 ;
   adm:userTooltip  "The various sorts of agents represented in the ontology, such as Persons, Organizations and so on"@en .

Now that I look closer, the rdfs:domain uses and the rdfs:Datatype uses, for example bdo:IntegerOrString, are all showing increasing owl:unionOf triples.

So there is some problem surrounding owl:unionOf. This problem has been introduced relatively recently since I'm pretty confident that we have seen stable retrieval of the ontology previously via lds-pdi and we have had owl:unionOf for well over a year or two.

eroux commented 3 years ago

sure, I actually think these are rendered obsolete by SHACL and we could just remove them... but let's get to the bottom of this, it's a very weird bug

xristy commented 3 years ago

The following work nicely:

curl http://purl.bdrc.io/ontology/core.trig
curl http://purl.bdrc.io/ontology/core.ttl
curl http://purl.bdrc.io/ontology/core.rdf
curl http://purl.bdrc.io/ontology/core.jsonld

Repeated calls are stable, always returning the same number of lines and not adding any triples.

Fwiw, n3 returns 0 triples.

only curl -H "accept: text/turtle" http://purl.bdrc.io/ontology/core adds triples on each use.

I think when there is an extension present then the path goes thru PublicDataController.java#L584.

xristy commented 3 years ago

.nt (not .n3) and .nq work

MarcAgate commented 3 years ago

Fixed as of commit a5cb61d

There was some old piece of code writing triples that were read from the local ontology file and cached as bytes. These bytes were added to a model obtained from the new code (i.e., not from a cache, but from the map containing the ont models read by the OntDocument manager from the local file at startup). Removing the old code (i.e., the part adding bytes from the cache) and consistently returning the OntModel from the ontModel map of the OntData class, built at startup, fixes the issue.
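The shape of the bug and of the fix described above can be sketched as follows. This is a simplified Python illustration, not the actual Java code from the commit: the class name `OntData` and the idea of an ont model map come from the description, while `byte_cache`, `get_model_buggy`, `get_model_fixed` and the toy `parse` function are hypothetical.

```python
import itertools
from typing import Dict, Set, Tuple

Triple = Tuple[str, ...]
_parse_runs = itertools.count()


def parse(raw: bytes) -> Set[Triple]:
    """Toy parser: one 's p o' triple per line; blank nodes are fresh per parse."""
    run = next(_parse_runs)

    def term(t: str) -> str:
        return f"{t}#{run}" if t.startswith("_:") else t

    lines = raw.decode().splitlines()
    return {tuple(term(t) for t in line.split()) for line in lines if line.strip()}


class OntData:
    def __init__(self, files: Dict[str, bytes]):
        # Old path (hypothetical name): raw bytes of the local ontology file.
        self.byte_cache = dict(files)
        # New path: models built once at startup.
        self.ont_models = {name: parse(raw) for name, raw in files.items()}

    def get_model_buggy(self, name: str) -> Set[Triple]:
        """Re-parses the cached bytes into the shared model on every request."""
        model = self.ont_models[name]
        model |= parse(self.byte_cache[name])  # mutates the shared model in place
        return model

    def get_model_fixed(self, name: str) -> Set[Triple]:
        """Returns the startup model unchanged."""
        return self.ont_models[name]
```

With a file whose triples all involve a blank node, each call to `get_model_buggy` grows the shared model by the size of the file, while `get_model_fixed` returns the same size on every call.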