snikproject / ontology

Public SNIK Ontology. An ontology of information management in hospitals.
https://snikproject.github.io/ontology/

Investigate page and chapter statements #296

Closed KonradHoeffner closed 5 years ago

KonradHoeffner commented 5 years ago

We only seem to have sparse page and chapter data. Investigate how many such statements we have, in which form, and whether we lost any during the remodels.

KonradHoeffner commented 5 years ago
property count
http://www.snik.eu/ontology/he/page 2131
http://www.snik.eu/ontology/bb/page 1146
http://www.snik.eu/ontology/it4it/page 18
http://www.snik.eu/ontology/ob/page 774
relation class count
http://www.snik.eu/ontology/it4it/page http://www.w3.org/2002/07/owl#NamedIndividual 7
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#Class 1927
http://www.snik.eu/ontology/it4it/page http://www.w3.org/2002/07/owl#Class 18
http://xmlns.com/foaf/0.1/homepage http://www.w3.org/2002/07/owl#Ontology 7
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#Class 1124
http://www.snik.eu/ontology/ob/page http://www.w3.org/2002/07/owl#DatatypeProperty 1
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#Axiom 197
http://www.snik.eu/ontology/ob/page http://www.w3.org/2002/07/owl#Class 774
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#ObjectProperty 1
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#ObjectProperty 7
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#DatatypeProperty 21
select ?p as ?relation ?c as ?class count(distinct * ) as ?count
{
 ?x ?p ?y.
 ?x a ?c.
 filter(REGEX(STR(?p),"page"))
} group by ?p ?c
KonradHoeffner commented 5 years ago

In the old bb.rdf from the repository (last update: 2017-08-01), there are only 494 occurrences of page, see grep "page>[^<]" bb.rdf | wc -l. The SPARQL count above is higher because the query also counts empty page statements, which should be removed. In the meantime, here is the count for nonempty pages:

relation class count
http://www.snik.eu/ontology/it4it/page http://www.w3.org/2002/07/owl#NamedIndividual 7
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#Class 1927
http://www.snik.eu/ontology/it4it/page http://www.w3.org/2002/07/owl#Class 18
http://xmlns.com/foaf/0.1/homepage http://www.w3.org/2002/07/owl#Ontology 7
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#Class 482
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#Axiom 197
http://www.snik.eu/ontology/ob/page http://www.w3.org/2002/07/owl#Class 1
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#ObjectProperty 1
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#ObjectProperty 7
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#DatatypeProperty 19
select ?p as ?relation ?c as ?class count(distinct * ) as ?count
{
 ?x ?p ?y.
 ?x a ?c.
 filter(?y!="").
 filter(REGEX(STR(?p),"page"))
} group by ?p ?c

And for comparison the empty ones:

relation class count
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#Class 642
http://www.snik.eu/ontology/ob/page http://www.w3.org/2002/07/owl#DatatypeProperty 1
http://www.snik.eu/ontology/ob/page http://www.w3.org/2002/07/owl#Class 773
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#DatatypeProperty 2
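
As a sanity check, the nonempty and empty counts should add up to the totals from the first query. A minimal Python sketch using only the bb and ob numbers reported above; the dictionary layout, the prefixed names and the helper name `mismatches` are illustrative assumptions, not part of the SNIK tooling:

```python
# Counts copied from the result tables in this thread, keyed by (property, class).
total = {
    ("bb:page", "owl:Class"): 1124,
    ("bb:page", "owl:DatatypeProperty"): 21,
    ("ob:page", "owl:Class"): 774,
    ("ob:page", "owl:DatatypeProperty"): 1,
}
nonempty = {
    ("bb:page", "owl:Class"): 482,
    ("bb:page", "owl:DatatypeProperty"): 19,
    ("ob:page", "owl:Class"): 1,
}
empty = {
    ("bb:page", "owl:Class"): 642,
    ("bb:page", "owl:DatatypeProperty"): 2,
    ("ob:page", "owl:Class"): 773,
    ("ob:page", "owl:DatatypeProperty"): 1,
}

def mismatches(total, nonempty, empty):
    """Return the keys whose nonempty + empty counts differ from the total."""
    return [key for key, n in total.items()
            if nonempty.get(key, 0) + empty.get(key, 0) != n]

print(mismatches(total, nonempty, empty))  # prints [] if everything adds up
```

For these four rows the list is empty, so the split into empty and nonempty statements accounts for every page statement counted by the first query.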
KonradHoeffner commented 5 years ago

There don't seem to be any pages for triples, just for classes and relations. Maybe they are mistakenly placed in the relations? How many are there at maximum for a single subject?

select ?x count(?y)
{
 ?x bb:page ?y.
 filter(?y!="").
} group by ?x
order by desc(count(?y))

Result: At most two per subject, which looks fine.

Next Step: Look at the old extraction table of bb to find out if there are page statements that are missing now.

KonradHoeffner commented 5 years ago

They were there all along in the spreadsheet, in all dumps in the repository and on the SPARQL endpoint. They just weren't found because the property is named TripelPage, which the case-sensitive pattern "page" does not match.

select ?p as ?relation ?c as ?class count(distinct * ) as ?count
{
 ?x ?p ?y.
 ?x a ?c.
 filter(?y!="").
 filter(REGEX(STR(?p),"page","i"))
} group by ?p ?c
relation class count
http://www.snik.eu/ontology/bb/TripelPage http://www.w3.org/2002/07/owl#Axiom 2446
http://www.snik.eu/ontology/it4it/page http://www.w3.org/2002/07/owl#NamedIndividual 7
http://www.snik.eu/ontology/meta/DefinitionDEPage http://www.w3.org/2002/07/owl#Class 310
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#Class 1927
http://www.snik.eu/ontology/it4it/page http://www.w3.org/2002/07/owl#Class 18
http://xmlns.com/foaf/0.1/homepage http://www.w3.org/2002/07/owl#Ontology 7
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#Class 482
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#Axiom 197
http://www.snik.eu/ontology/ob/TripelPage http://www.w3.org/2002/07/owl#Axiom 2892
http://www.snik.eu/ontology/ob/page http://www.w3.org/2002/07/owl#Class 1
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#ObjectProperty 1
http://www.snik.eu/ontology/he/page http://www.w3.org/2002/07/owl#ObjectProperty 7
http://www.snik.eu/ontology/bb/page http://www.w3.org/2002/07/owl#DatatypeProperty 19
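
To illustrate why the earlier queries missed these properties, here is a small Python sketch that applies the same pattern with and without the case-insensitive flag to four property IRIs from the table above. The helper name `matching` is hypothetical, and Python's `re.search` only approximates the behaviour of SPARQL REGEX for this simple pattern:

```python
import re

# Property IRIs taken from the result table above.
properties = [
    "http://www.snik.eu/ontology/bb/page",
    "http://www.snik.eu/ontology/bb/TripelPage",
    "http://www.snik.eu/ontology/meta/DefinitionDEPage",
    "http://xmlns.com/foaf/0.1/homepage",
]

def matching(pattern, iris, flags=0):
    """Return the IRIs that contain a match for pattern (unanchored search)."""
    return [iri for iri in iris if re.search(pattern, iri, flags)]

case_sensitive = matching("page", properties)
case_insensitive = matching("page", properties, re.IGNORECASE)

# Properties that only show up once the "i" flag is set.
print(sorted(set(case_insensitive) - set(case_sensitive)))
```

Only TripelPage and DefinitionDEPage appear in the difference. Because the pattern is unanchored, foaf:homepage matches in both variants, which is why it shows up in all result tables.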
KonradHoeffner commented 5 years ago

Are the axioms still bound to valid subjects and objects?

Get an overview via:

select *
from sniko:bb
from sniko:ob
{
 ?x bb:TripelPage|ob:TripelPage ?page.
 ?x ?p ?o.
}
x page p o
nodeID://b207595 "136"^^http://www.w3.org/2001/XMLSchema#string http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Axiom
nodeID://b207612 "155, 164"^^http://www.w3.org/2001/XMLSchema#string http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Axiom
nodeID://b207617 "130"^^http://www.w3.org/2001/XMLSchema#string http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Axiom
nodeID://b207622 "137"^^http://www.w3.org/2001/XMLSchema#string http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Axiom
nodeID://b207624 "137"^^http://www.w3.org/2001/XMLSchema#string http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Axiom
nodeID://b207636 "127"^^http://www.w3.org/2001/XMLSchema#string http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Axiom

...

Hard to see. Rewrite the blank nodes so that they are in the namespace of their subontology and can be viewed via LodView.

KonradHoeffner commented 5 years ago

rename blank nodes to "real" nodes

blank nodes in subject position

sparql
select ?g count(*)
{
 graph ?g
 {
  ?x ?p ?o.
  FILTER(REGEX(STR(?x),"nodeID://"))
 }
} group by ?g
g callret-1
http://www.w3.org/2002/07/owl# 6
http://www.snik.eu/ontology/ciox 200
http://www.snik.eu/ontology/meta 6
http://www.snik.eu/ontology/bb 18092
http://www.w3.org/2004/02/skos/core# 6
http://www.snik.eu/ontology/it4it 27
http://www.snik.eu/ontology/ob 24368

blank nodes in object position

select ?g count(*)
{
 graph ?g
 {
  ?x ?p ?o.
  FILTER(REGEX(STR(?o),"nodeID://"))
 }
} group by ?g
g callret-1
http://www.w3.org/2002/07/owl# 3
http://www.snik.eu/ontology/meta 2
http://www.w3.org/2004/02/skos/core# 3
http://www.snik.eu/ontology/it4it 3
http://www.snik.eu/ontology/ob 2338
http://www.snik.eu/ontology/bb 1139
KonradHoeffner commented 5 years ago

Rename bb subject

SPARQL
with <http://www.snik.eu/ontology/bb>
delete
{
  ?x ?p ?o.
}
insert
{
  ?y ?p ?o.
}
where
{
  ?x ?p ?o.
  FILTER(REGEX(STR(?x),"nodeID://"))
  BIND(IRI(REPLACE(STR(?x),"nodeID://b","http://www.snik.eu/ontology/bb/blank")) as ?y).
}
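
The string rewrite performed by the BIND can be checked outside the triple store. A minimal Python sketch, assuming blank node identifiers of the form `nodeID://b<number>` as shown in the results above; the function name `skolemize` is hypothetical:

```python
import re

def skolemize(node_id, namespace):
    """Mimic REPLACE(STR(?x), "nodeID://b", namespace + "blank") from the update."""
    return re.sub("nodeID://b", namespace + "blank", node_id)

# Blank node identifier taken from the TripelPage results above.
print(skolemize("nodeID://b207595", "http://www.snik.eu/ontology/bb/"))
# prints http://www.snik.eu/ontology/bb/blank207595
```

The numeric part of the identifier is kept, so the subject and object rewrites map the same blank node to the same IRI.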

Analogously for ob and ciox:
Modify <http://www.snik.eu/ontology/ob>, delete 24368 (or less) and insert 24368 (or less) triples -- done
Modify <http://www.snik.eu/ontology/ciox>, delete 200 (or less) and insert 200 (or less) triples -- done

Rename bb object

SPARQL
with <http://www.snik.eu/ontology/bb>
delete
{
  ?x ?p ?o.
}
insert
{
  ?x ?p ?y.
}
where
{
  ?x ?p ?o.
  FILTER(REGEX(STR(?o),"nodeID://"))
  BIND(IRI(REPLACE(STR(?o),"nodeID://b","http://www.snik.eu/ontology/bb/blank")) as ?y).
}

Modify <http://www.snik.eu/ontology/bb>, delete 1139 (or less) and insert 1139 (or less) triples -- done
Modify <http://www.snik.eu/ontology/ob>, delete 2338 (or less) and insert 2338 (or less) triples -- done

KonradHoeffner commented 5 years ago

Investigate whether the axioms' sources and targets still exist

select *
{
 ?x owl:annotatedSource|owl:annotatedTarget ?o.
 MINUS {?o a ?something.}
}

Result: 16 missing, so it seems to work in general. Separate issue: #297
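
The logic of the MINUS check above can be sketched in plain Python over a list of triples. The sample data and the function name `dangling` are made up for illustration, and unlike the query, which returns one row per axiom and object, this sketch returns each missing resource only once:

```python
def dangling(triples):
    """Return annotated sources/targets that have no rdf:type statement."""
    typed = {s for s, p, o in triples if p == "rdf:type"}
    referenced = {o for s, p, o in triples
                  if p in ("owl:annotatedSource", "owl:annotatedTarget")}
    return sorted(referenced - typed)

# Hypothetical sample: one axiom whose target no longer exists.
sample = [
    ("bb:blank1", "rdf:type", "owl:Axiom"),
    ("bb:blank1", "owl:annotatedSource", "bb:Hospital"),
    ("bb:blank1", "owl:annotatedTarget", "bb:Gone"),
    ("bb:Hospital", "rdf:type", "owl:Class"),
]
print(dangling(sample))  # prints ['bb:Gone']
```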