Closed xristy closed 4 years ago
I should mention that the translation triples are in bdg:trans_core_bo. So there's a question of how that graph gets updated on fuseki when a change is pushed to owl-schema/translations/core_bo.ttl.
You have to specify the graph in ontPolicy.rdf (ontGraph property) so that it will normally put this ont data in this graph.
I deleted the ontologySchema graph and ran the loading process. However, I still have the triple you mentioned above. It must be somewhere in a file loaded by the OntDocManager.
The triple, among similar ones, is in owl-schema/translations/core_bo.ttl.
The owl-schema/ont-policy.rdf has:
<OntologySpec>
<!-- local version of the Admin translations vocabulary -->
<publicURI rdf:resource="http://purl.bdrc.io/ontology/translations/CoreBo/"/>
<altURL rdf:resource="owl-schema/translations/core_bo.ttl"/>
<altURL rdf:resource="https://raw.githubusercontent.com/buda-base/owl-schema/master/translations/core_bo.ttl"/>
</OntologySpec>
so I would assume it would be loaded into bdg:ontologySchema since that is:
<adm:defaultOntGraph rdf:resource="http://purl.bdrc.io/graph/ontologySchema"/>
I do not know where bdg:trans_core_bo comes from or whether it's needed now or not. @eroux ?
The translation triples are also loaded into bdg:ontologySchema, as makes sense from the ont-policy.rdf.
So now we know that bdg:trans_core_bo is not needed and it has been dropped by @MarcAgate from fuseki corerw.
It is sufficient to update repo ontology-translation and the owl-schema/translations files.
@MarcAgate I just pushed commit 27e583 to editor-templates and bdg:PersonLocalShapes was updated but bdg:PersonShapes didn't get completely updated.
It looks like the prior version of person.local.shapes.ttl got merged with the pushed version of person.shapes.ttl.
The nature of the changes is such that I can't tell what happened w/ person.ui.shapes.ttl since it is an accumulation of person.shapes.ttl and person.local.shapes.ttl, which makes it hard to tell whether triples came multiple times from several files.
I also tried:
curl -H "Content-Type: application/json" -X POST "http://purl.bdrc.io/callbacks/github/editor-templates"
but that just gives:
{"timestamp":"2020-06-05T21:40:03.839+0000","status":405,"error":"Method Not Allowed","message":"Request method 'POST' not supported","path":"/callbacks/github/editor-templates"}
I put the full log of the update here below. The first part (OntPolicy for uri .... etc ) gives you the relationship between graphs and files according to ontPolicy.rdf.
The end gives you the update of each graph and its size (don't worry about the "InfModel Size" display, as it is wrong since there is no inference here). Hopefully you'll be able to trace things from that. I don't know about the correct (or expected) imports/merge etc.
I'm not seeing any obvious problems in the log that you sent. The log information doesn't show the importing or file dates and so on that would be needed to see what is getting merged in ldspdi.
I realize that my note from yesterday evening didn't provide any details that might be needed in tracking stuff down so I'll recount them below:
Yesterday evening's commit 27e583f was in part about moving UI related triples from person.local.shapes.ttl and person.shapes.ttl to person.ui.shapes.ttl. These were triples with predicates sh:name, sh:description, and dash:editor. From the person.local.shapes.ttl section at the top of the commit are the following:
sh:description "this Person may have one or more names."@en ;
from bds:PersonShape-personName
sh:description "this Person has a name given by the label."@en ;
from bds:PersonNameShape-personNameLabel
sh:description "this Person may have zero or more events like birth, death, ordination."@en ;
from bds:PersonShape-personEvent
dash:editor dash:InstancesSelectEditor ;
and L#167 sh:name "role associated with the event"@en ;
from bds:PersonEventShape-personEventRole
sh:description "this Person may have at most one Gender or none if not known."@en ;
from bds:PersonShape-personGender
Checking http://purl.bdrc.io/shapes/core/PersonLocalShapes confirms that these triples are not present from ldspdi and running an appropriate query on fuseki (construct over bdg:PersonLocalShapes) shows they are also not present in corerw in this graph. This is as expected based on the commit.
Note/ the response to http://purl.bdrc.io/shapes/core/PersonLocalShapes/ does include occurrences of the predicates sh:name, sh:description, and dash:editor; however, those are coming from root.shapes.ttl and event.shapes.ttl which I have not yet factored - that was my next task. /Note
The problem is that when visiting http://purl.bdrc.io/shapes/core/PersonShapes the six triples identified above are present in the result, and they are also returned by the appropriate query (construct over bdg:PersonShapes) on fuseki. So ldspdi is serving up the same thing that appears on fuseki, but as far as I can tell ldspdi has loaded bdg:PersonShapes content with triples that are not there in GH.
Note/ The import chain is:
PersonShapes
PersonLocalShapes
EventShapes
RootShapes
BaseShapes
/Note
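For reference, a minimal sketch of how the aggregate graph for PersonShapes is expected to follow from its imports. It is plain Python rather than Jena, the helper names (import_closure, aggregate) and the ex: triples are made up for illustration, and since the nesting of the chain is not recoverable from the note above it is treated as a simple linear chain, which is an assumption.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Assumed linear reading of the import chain from the note above.
IMPORTS: Dict[str, List[str]] = {
    "PersonShapes": ["PersonLocalShapes"],
    "PersonLocalShapes": ["EventShapes"],
    "EventShapes": ["RootShapes"],
    "RootShapes": ["BaseShapes"],
}


def import_closure(root: str, imports: Dict[str, List[str]]) -> List[str]:
    """Return root plus everything reachable through imports, each name once."""
    seen: List[str] = []
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.append(name)
        # reversed() keeps the declared import order when popping from the stack
        stack.extend(reversed(imports.get(name, [])))
    return seen


def aggregate(root: str, imports: Dict[str, List[str]],
              local: Dict[str, Set[Triple]]) -> Set[Triple]:
    """Union of the triples of root's own file and of every imported file."""
    merged: Set[Triple] = set()
    for name in import_closure(root, imports):
        merged |= local.get(name, set())
    return merged


# Placeholder triples (hypothetical), one per file.
local: Dict[str, Set[Triple]] = {
    "PersonShapes": {("ex:s1", "ex:p", "ex:o")},
    "PersonLocalShapes": {("ex:s2", "ex:p", "ex:o")},
}
before = aggregate("PersonShapes", IMPORTS, local)

# Simulate a commit that removes the triple from the imported file.
local["PersonLocalShapes"] = set()
after = aggregate("PersonShapes", IMPORTS, local)
```

Under this reading, a triple deleted from an imported file should also disappear from the aggregate of the importing graph once everything is reloaded, which is the behaviour that bdg:PersonShapes is not showing.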
It is relevant to note that the various triples that were in person.shapes.ttl prior to this latest commit and which I moved to person.ui.shapes.ttl are indeed not present in bdg:PersonShapes on fuseki or in the response to visiting http://purl.bdrc.io/shapes/core/PersonShapes which is as expected.
To make matters much more odd there is content present in bdg:PersonShapes - on fuseki and from ldspdi - that was removed from person.local.shapes.ttl via commit b44404 on 3 June, 4 commits prior to last evening's commit. Specifically:
bds:PersonEventShape-personEventType
a sh:PropertyShape ;
dash:editor dash:InstancesSelectEditor ;
sh:class bdo:PersonEventType ;
sh:maxCount 1 ;
sh:message "exactly one PersonEventType required"@en ;
sh:minCount 1 ;
sh:name "role associated with the event"@en ;
sh:path bdo:eventType .
and:
bds:PersonEventShape a sh:NodeShape ;
rdfs:label "Person Event Shape"@en ;
bds:nodeShapeType bds:FacetShape ;
sh:property bds:PersonEventShape-personEventCorporation , bds:PersonEventShape-personEventRole , bds:PersonEventShape-personEventType ;
sh:targetClass bdo:PersonEvent .
The bds:PersonEventShape-personEventCorporation was moved to PersonShapes in a still earlier commit 8ca252 from 2 June and so is expected to be in PersonShapes, but the reference to bds:PersonEventShape-personEventType should not be there.
I have added OntTestLoading3.java, using just OntDocumentManager, to the shapes-testing repo. It loads PersonLocalShapes and PersonShapes per editor-templates/ont-policy.rdf, processing imports, and then writes out both the models from the files, person.local.shapes.ttl and person.shapes.ttl, and the aggregate models from processing imports.
The results are as expected based on the master branch of editor-templates. There are no stale triples appearing in the PersonShapes full model/graph.
If you want to run it you'll need to change L18 to reflect where you want the output files written.
I do not know what more I can do at this point. The evidence seems to point to something in ldspdi. The GH content is as intended. The OntDocumentManager w/ editor-templates/ont-policy.rdf appears to produce the expected results. Also it seems that ldspdi is updating newcorerw in the same manner as corerw.
@MarcAgate on 9 June @ 19:55Z, I pushed commit bdcb02 - no commits since then.
This commit deleted
sh:description "Zero or more notes may be associated with an entity"@en ;
from the defn of bds:EntityShape-note in root.local.shapes.ttl. That triple is still present in bdg:PersonShapes and bdg:PersonLocalShapes when retrieving via graph uri from lds-pdi or via construct on fuseki.
The commit also added 25 occurrences of sh:message in root.local.shapes.ttl. The message triples occur mostly in bds:NoteShape-... and bds:ContentLocationShape-... property shape defns. None of the added sh:messages appear in bdg:PersonShapes, bdg:PersonLocalShapes, and bdg:PersonUIShapes. All of the sh:messages appear in bdg:shapesSchema which is the adm:defaultOntGraph.
Please refer to the README.md for updated info on the import patterns.
It is also worth noting that the triples that were deleted but still appearing in bdg:PersonShapes from the comment just above are now no longer present as was to be expected from commit 27e583f.
There is also an anomaly in bdg:shapesSchema. In root.local.shapes.ttl is:
bds:EntityShape-skos_prefLabel
a sh:PropertyShape ;
sh:message "each Entity resource must have at least one skos:prefLabel and each must be a unique language"@en ;
sh:path skos:prefLabel ;
sh:datatype rdf:langString ;
sh:languageIn (
"en" "zh" "bo" "bo-x-ewts" "km-x-femc" "km" "fr" "km-x-bdrc"
) ;
sh:minCount 1 ;
sh:uniqueLang true ;
.
and in all the adm:ontGraph named graphs there is a single occurrence of
sh:languageIn (
"en" "zh" "bo" "bo-x-ewts" "km-x-femc" "km" "fr" "km-x-bdrc"
) ;
but in bdg:shapesSchema there are two occurrences as can be seen from:
select ?s ?p ?o ?g
where {
bind (sh:languageIn as ?p)
graph ?g { ?s ?p ?o . }
} limit 3000
and from:
construct { ?s ?p ?o . }
where {
bind (bdg:shapesSchema as ?g)
graph ?g { ?s ?p ?o . }
} limit 3000
and then looking at the defn:
bds:EntityShape-skos_prefLabel
a sh:PropertyShape ;
sh:datatype rdf:langString ;
sh:description "require unique language from among the listed choices"@en ;
sh:languageIn ( "en" "zh" "bo" "bo-x-ewts" "km-x-femc" "km" "fr" "km-x-bdrc" ) ;
sh:languageIn ( "en" "zh" "bo" "bo-x-ewts" "km-x-femc" "km" "fr" "km-x-bdrc" ) ;
sh:message "each Entity resource must have at least one skos:prefLabel and each must be a unique language"@en ;
sh:minCount 1 ;
sh:name "pref label"@en ;
sh:order "1"^^xsd:decimal ;
sh:path skos:prefLabel ;
sh:uniqueLang true .
At first I wasn't sure whether this was a problem with Jena so I added OntTestLoading4 to see if I could reproduce the double occurrence via OntDocumentManager, but it seems that the double occurrence, apparently owing to distinct blank nodes, may be in lds-pdi.
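One way such a double occurrence can arise, sketched below in plain Python rather than Jena: blank nodes are scoped to a single parse, so if the same file ends up being read twice and the results are merged into one graph, the head of the sh:languageIn list gets a different blank node each time and the triple is no longer deduplicated, while triples with IRI or literal objects collapse as usual. The load_shape helper and the _:bN labels are hypothetical, and this is only a guess at the mechanism, not a confirmed diagnosis of lds-pdi.

```python
import itertools
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

_fresh = itertools.count()  # every simulated parse draws new blank node labels from here


def load_shape(langs: List[str]) -> Set[Triple]:
    """Mimic one parse of the shape: the RDF list gets freshly minted blank nodes."""
    subject = "bds:EntityShape-skos_prefLabel"
    triples: Set[Triple] = {(subject, "sh:uniqueLang", "true")}
    nodes = [f"_:b{next(_fresh)}" for _ in langs]
    triples.add((subject, "sh:languageIn", nodes[0]))
    for i, (node, lang) in enumerate(zip(nodes, langs)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.add((node, "rdf:first", f'"{lang}"'))
        triples.add((node, "rdf:rest", rest))
    return triples


langs = ["en", "zh", "bo"]
# Merge two independent parses of the same content into one graph.
merged = load_shape(langs) | load_shape(langs)

language_in = [t for t in merged if t[1] == "sh:languageIn"]
unique_lang = [t for t in merged if t[1] == "sh:uniqueLang"]
```

Here language_in ends up with two entries and unique_lang with one, which matches the shape of the anomaly seen in bdg:shapesSchema.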
A pattern may be emerging: changes to shapes file A appear in bdg:A (if defined). If B imports A then the changes in A do not appear in bdg:B (if defined) or subsequent imports of B (if any).
Hopefully, this will help pinpoint where the problem in lds-pdi is.
I have prepared basic questions with yes/no answers (please explain very briefly if the answer is a "No"). These questions apply to the current state of the editor-templates repo (commit 8d5dd4a).
http://purl.bdrc.io/graph/WorkShapes
http://purl.bdrc.io/graph/PersonUIShapes
http://purl.bdrc.io/graph/PersonLocalShapes
http://purl.bdrc.io/graph/PersonShapes
http://purl.bdrc.io/graph/ItemShapes
http://purl.bdrc.io/graph/InstanceShapes
http://purl.bdrc.io/graph/CorporationShapes
http://purl.bdrc.io/graph/shapesSchema
1) Are the graphs in the above list, designated by their uris, the only graphs we should find on fuseki?
2) Is http://purl.bdrc.io/graph/shapesSchema supposed to be a merge of all other graphs in the list?
http://purl.bdrc.io/shapes/core/IdentifierShapes/
http://purl.bdrc.io/shapes/core/EventUIShapes/
http://purl.bdrc.io/shapes/core/EventLocalShapes/
http://purl.bdrc.io/shapes/core/EventShapes/
http://purl.bdrc.io/shapes/core/RootUIShapes/
http://purl.bdrc.io/shapes/core/RootLocalShapes/
http://purl.bdrc.io/shapes/core/RootShapes/
http://purl.bdrc.io/shapes/adm/AdminShapes/
http://purl.bdrc.io/shapes/core/BaseShapes/
3) The ontologies corresponding to the uris above do not have an individual graph in fuseki. Is this correct?
4) This data is dispatched into the graphs of the first list (and pushed to fuseki) by the sole magic of the import feature of the OntDocumentManager. Is this correct?
All the models as loaded and generated by the OntDocument from OntPolicy.rdf (i.e from the list of documents read from OntPolicy.rdf) have been saved as ttl in https://github.com/buda-base/lds-pdi/tree/master/src/test/resources/ttl/shapes
5) Are these correct ?
I think I'll be able to move further along my debug path once I have answers to these 5 questions.
http://purl.bdrc.io/graph/WorkShapes
http://purl.bdrc.io/graph/PersonUIShapes
http://purl.bdrc.io/graph/PersonLocalShapes
http://purl.bdrc.io/graph/PersonShapes
http://purl.bdrc.io/graph/ItemShapes
http://purl.bdrc.io/graph/InstanceShapes
http://purl.bdrc.io/graph/CorporationShapes
http://purl.bdrc.io/graph/shapesSchema
- Are the graphs in the above list, designated by their uris, the only graphs we should find on fuseki?
Yes
- Is http://purl.bdrc.io/graph/shapesSchema supposed to be a merge of all other graphs in the list?
Yes
http://purl.bdrc.io/shapes/core/IdentifierShapes/
http://purl.bdrc.io/shapes/core/EventUIShapes/
http://purl.bdrc.io/shapes/core/EventLocalShapes/
http://purl.bdrc.io/shapes/core/EventShapes/
http://purl.bdrc.io/shapes/core/RootUIShapes/
http://purl.bdrc.io/shapes/core/RootLocalShapes/
http://purl.bdrc.io/shapes/core/RootShapes/
http://purl.bdrc.io/shapes/adm/AdminShapes/
http://purl.bdrc.io/shapes/core/BaseShapes/
- The ontologies corresponding to the uris above do not have an individual graph in fuseki. Is this correct?
Yes
- This data is dispatched into the graphs of the first list (and pushed to fuseki) by the sole magic of the import feature of the OntDocumentManager. Is this correct?
Yes, except that OntDocumentManager does not push to fuseki.
All the models as loaded and generated by the OntDocument from OntPolicy.rdf (i.e from the list of documents read from OntPolicy.rdf) have been saved as ttl in https://github.com/buda-base/lds-pdi/tree/master/src/test/resources/ttl/shapes
- Are these correct ?
I believe so. I closely checked AdminShapes.ttl, BaseShapes.ttl, and PersonShapes.ttl, and skimmed the others.
I think I'll be able to move further along my debug path once I have answers to these 5 questions.
Sounds good.
It was a combination of OntDocumentManager caching settings and browser caching misbehaviors (do not trust "empty cache" and the like). The only safe testing is via curl and s-query.
But wait! There's more.
I pushed commit af0c1b6: finished adding sh:message to event.local.shapes.ttl; no need to import dash except in root.ui.shapes.ttl. I.e., there were unneeded owl:imports <http://datashapes.org/dash> ; statements.
Marc verified that the GH webhook fired and ldspdi loaded the files fresh from GH/editor-templates; and the files are saved in buda1:/usr/local/ldspdi/.
grep "<http://datashapes.org/dash>" *.ttl
shows that the imports still remain in the following files even though they are not present in GH:
BaseShapes.ttl: owl:imports <http://datashapes.org/dash> ;
EventShapes.ttl: owl:imports <http://purl.bdrc.io/shapes/core/RootShapes/> , <http://purl.bdrc.io/shapes/core/EventLocalShapes/> , <http://datashapes.org/dash> ;
EventUIShapes.ttl: owl:imports <http://purl.bdrc.io/shapes/core/RootUIShapes/> , <http://purl.bdrc.io/shapes/core/EventShapes/> , <http://datashapes.org/dash> ;
PersonLocalShapes.ttl: owl:imports <http://purl.bdrc.io/shapes/core/EventLocalShapes/> , <http://datashapes.org/dash> ;
PersonShapes.ttl: owl:imports <http://purl.bdrc.io/shapes/core/PersonLocalShapes/> , <http://purl.bdrc.io/shapes/core/EventShapes/> , <http://datashapes.org/dash> ;
RootLocalShapes.ttl: owl:imports <http://purl.bdrc.io/shapes/core/BaseShapes/> , <http://datashapes.org/dash> ;
RootShapes.ttl: owl:imports <http://purl.bdrc.io/shapes/core/RootLocalShapes/> , <http://datashapes.org/dash> ;
and the other files from which the imports were removed do not retain the dash import:
AdminShapes.ttl
CorporationShapes.ttl
EventLocalShapes.ttl
IdentifierShapes.ttl
InstanceShapes.ttl
ItemShapes.ttl
WorkShapes.ttl
Further, running:
s-query --query TEST_SPARQL_001.txt --server http://buda1.bdrc.io:13180/fuseki/corerw/query > TEST_OUTPUT/ALL_SHAPES08.ttl
with the query file TEST_SPARQL_001.txt yields ALL_SHAPES08.ttl which shows that the bdg:shapesSchema contains the triples that were removed. Using the commandline s-query avoids any question of browser caching when using the fuseki web i/f.
Substituting bdg:PersonLocalShapes in the query file and running s-query produces PersonLocalShapes_ALL09.ttl with an occurrence of <http://datashapes.org/dash> in each of BaseShapes:, RootLocalShapes:, and PersonLocalShapes.
Running curl -X GET "http://purl.bdrc.io/shapes/core/PersonLocalShapes" via ldspdi produces the same result as s-query.
OTOH, running OntTestLoading4 in shapes-testing produces results equivalent to GH contents: PersonLocalShapes_ALL07.ttl. There are no occurrences of <http://datashapes.org/dash>.
The issue still remains and is not tied to cache behavior in web browsers.
Here's a proposal that will allow a bit more debugging: we could use the same graph data as in other graphs, and have something like:
bda:shapesSchema a adm:AdminData ;
    adm:gitRevision "xxx" ; # the git revision of the editor-templates repo
    adm:graphId bdg:shapesSchema ;
    adm:gitRepo bda:GR0010 ; # or whatever, a new git repo individual for editor-templates
    .
what do you think?
I also think we should always have a local git repo and that the webhooks just do a pull + reimport of the local repo. That way we'll avoid other caches such as the github download URLs.
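To make the first idea a bit more concrete, here is a rough sketch of how the admin data could be rendered from the push event when the webhook fires. It assumes the handler has the JSON body of the GitHub push event at hand (whose "after" field carries the SHA of the head commit after the push); the function name admin_data_turtle is hypothetical, bda:GR0010 is just the placeholder from the proposal above, and the SHA in the example is made up.

```python
import json


def admin_data_turtle(payload_json: str, git_repo: str = "bda:GR0010") -> str:
    """Render the proposed admin data for bdg:shapesSchema from a push payload."""
    payload = json.loads(payload_json)
    revision = payload["after"]  # head commit SHA after the push
    return (
        "bda:shapesSchema a adm:AdminData ;\n"
        f'    adm:gitRevision "{revision}" ;\n'
        "    adm:graphId bdg:shapesSchema ;\n"
        f"    adm:gitRepo {git_repo} ;\n"
        "    .\n"
    )


# Made-up payload, trimmed to the two fields relevant here.
example_payload = json.dumps({
    "ref": "refs/heads/master",
    "after": "1234567890abcdef1234567890abcdef12345678",
})
turtle = admin_data_turtle(example_payload)
```

With the revision stored next to the graph, a construct on fuseki would show directly which commit the loaded content claims to come from, which would make stale loads easier to spot.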
@xristy
More (hopefully useful) remarks here:
Files under /usr/local/ldspdi are serializations of the models returned by the DocManager. These are written while looping/reading over the list of documents built by the DocManager from OntPolicy.rdf.
If we have correct files (with imports removed) and incorrect files (with imports not being removed), within the same loop, then it means that the issue lies at the level of DocManager.
However, unless I am mistaken, the code loading the model and producing the files in ldspdi is exactly the same as the one you have in OntTestLoading4.
Furthermore, without changing anything in the code, I just restarted ldspdi and all the produced files are now as expected.
So it's not the config of DocManager, nor can it be the code that runs before pushing to fuseki and produces the ttl serialization files we used for debugging. How can you be sure it's not still a cache issue?
Let's work with the hypothesis that github doesn't update the download URLs fast enough and when fetching them just after a push, we don't always get the latest version of files. A workaround is to implement my two ideas. Unless there is another hypothesis of course.
Sounds good to me as the hypothesis you describe might actually be the case and using a local git repo thing will obviously solve the issue.
@MarcAgate I'm not sure it isn't a cache issue. I am sure that it isn't a web browser cache issue since my tests were w/ curl and s-query.
@eroux your hypothesis about a timing issue makes sense. I'm not sure though: if the webhook fires before GH has completed storing the push updates, I don't see how a pull from GH to a buda1 local repo will work better than reading via ldspdi/OntDocumentManager, unless git pull has some interaction with a push in progress on GH that fetching from a GH url doesn't.
I opened a ticket with GH
From: GitHub support@githubsupport.com
Subject: [GitHub Support] Confirmation - Request Received (#730928)
Date: June 15, 2020 at 10:07:32 AM CDT
To: Chris Tomlinson ct@moonvine.org
Reply-To: GitHub support@githubsupport.com
//Please do not write below this line//
Chris,
Thank you for contacting GitHub Support. We wanted to let you know that we've received your message. In order to respond to tickets with the greatest urgency as quickly as possible during the COVID-19 crisis, we've established a priority order. If you have questions regarding our recent announcement that makes most GitHub features accessible to our community free of charge, we have captured answers to common questions here.
Ticket ID: 730928
This email is a service from GitHub Support. [YDEM8W-3V07]
My Message
We have a webhook push event configured for https://github.com/buda-base/editor-templates. When our server receives:
https://purl.bdrc.io/callbacks/github/shapes (push)
then our server retrieves files of interest via raw.githubusercontent, like:
https://raw.githubusercontent.com/buda-base/editor-templates/master/templates/core/work.shapes.ttl
Sometimes we appear to end up with "old" content rather than the new pushed content.
Is there possibly a lag between when the push event is signaled and when the content fetched via raw.githubusercontent is up-to-date w/ push to the repo?
I have not been able to find any information regarding when contents retrieved via raw.githubusercontent are guaranteed to be in sync with the GH repo content as seen via git commands like git pull.
Thanks
@MarcAgate have you seen Best way to fetch content via API without hitting cache?
Nice! That talks about a github cache, and therefore might explain what we are experiencing. However, I think using the API is not applicable in our case since the actual download is made by the OntDocumentManager using urls coming from the OntPolicy.rdf.
It turns out that, exploring the api.github.com w/ curl, I see that the request to GET a file redirects to raw.githubusercontent.com.
For our information, I copy that here:
Hi Chris,
Thanks for reaching out and sorry for the delay in getting back to you on this.
Yeah, the contents returned from raw.githubusercontent.com might be stale since it's cached by our CDN.
You shouldn't really be using that for programmatic access.
If you're programmatically fetching file contents, you should be using the API:
http://developer.github.com/v3/
The API has well defined rate-limiting and caching behavior you can rely on. The raw.githubusercontent.com endpoint doesn't, so you might get limited or see cached content without warning.
Hope this helps.
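For what it's worth, a small sketch of what fetching one of our files through the API (rather than raw.githubusercontent.com) might look like. It only builds the request and does not send it. It uses the repository contents endpoint with the raw media type in the Accept header; the helper name contents_request is hypothetical, and whether this actually sidesteps the CDN cache in our setup has not been checked here, especially given the redirect observed above.

```python
from urllib.parse import quote
from urllib.request import Request


def contents_request(owner: str, repo: str, path: str, ref: str) -> Request:
    """Build (but do not send) a GET request for a file via the contents API."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}/contents/"
        f"{quote(path)}?ref={quote(ref, safe='')}"
    )
    # Ask for the raw file body instead of the default JSON envelope.
    return Request(url, headers={"Accept": "application/vnd.github.v3.raw"})


req = contents_request(
    "buda-base", "editor-templates", "templates/core/work.shapes.ttl", "master"
)
```

Passing a commit SHA as ref instead of a branch name would tie the fetch to an exact revision, but since the downloads are driven by the urls in OntPolicy.rdf this would need some indirection on our side.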
I believe this issue is now resolved due to the last changes made yesterday, following an nth issue with synchronization. The thing is that there are actually two caches being used in the process: the OntDocumentManager cache and the cache of this OntDocumentManager's FileManager. Both are now reset each time the webhook is triggered, as follows:
odm.setCacheModels(false);
odm.getFileManager().resetCache();
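To illustrate why resetting only one of the two caches is not enough, here is a toy model in plain Python. It is not Jena's implementation: ToyFileCache and ToyDocManager are hypothetical stand-ins for the FileManager cache and the OntDocumentManager model cache, the dict named remote stands in for the files on GH, and the toy clears the model cache where the real fix disables model caching.

```python
from typing import Dict, FrozenSet


class ToyFileCache:
    """Caches the raw text fetched for each URL until reset."""

    def __init__(self, remote: Dict[str, str]) -> None:
        self.remote = remote
        self.cached: Dict[str, str] = {}

    def read(self, url: str) -> str:
        if url not in self.cached:
            self.cached[url] = self.remote[url]
        return self.cached[url]

    def reset(self) -> None:
        self.cached.clear()


class ToyDocManager:
    """Caches the parsed model (here: a set of lines) for each URL."""

    def __init__(self, files: ToyFileCache) -> None:
        self.files = files
        self.models: Dict[str, FrozenSet[str]] = {}

    def get_model(self, url: str) -> FrozenSet[str]:
        if url not in self.models:
            self.models[url] = frozenset(self.files.read(url).splitlines())
        return self.models[url]

    def clear_models(self) -> None:
        self.models.clear()


remote = {"A": "t1\nt2"}
mgr = ToyDocManager(ToyFileCache(remote))
mgr.get_model("A")                 # first load fills both caches

remote["A"] = "t1"                 # t2 is removed upstream
stale = mgr.get_model("A")         # served from the model cache

mgr.clear_models()
still_stale = mgr.get_model("A")   # re-parsed, but from the cached file text

mgr.clear_models()
mgr.files.reset()
fresh = mgr.get_model("A")         # both layers bypassed, upstream content seen
```

In this toy, the removed line t2 survives in stale and still_stale and only disappears in fresh, which is consistent with the stale triples lingering until both caches were dealt with.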
There is no issue with lds-queries webhook since it is not Ontology related and not using the OntDocManager machinery.
Pushing commits to owl-schema and editor-templates does not reliably update fuseki.
I pushed a commit to owl-schema yesterday, a8885c removing the unused UNKNOWNS and it did not get reflected on fuseki.
I tried:
but that fails now
it's been in buda2 db-load to ensure fuseki is up-to-date on the ontologies.
Next I tried:
and that seemed to work for that push. Then I pushed a second commit, 737fb5, to clean-up UNKNOWNS from translations, but that didn't take hold so I tried clearcache again and that didn't work and I tried a bogus commit which did not work either.
The translation triples like:
are still in fuseki at this point.
I need both owl-schema and editor-templates to reliably update to help w/ developing and debugging.