BetaMasaheft / Documentation

The Written Culture of Christian Ethiopia: A Multimedia Research Environment

SPARQL endpoint does not fully function (fuseki not updated) #2047

Open eu-genia opened 2 years ago

eu-genia commented 2 years ago

No SPARQL search is possible, which also means that the Gender page https://betamasaheft.eu/gender is not working. I am not sure I have the time and energy to fix it, so I would hide the Gender button here (screenshot). LitFlow also does not work, but maybe I can fix it; if not, I would take that one out too. Both can be re-inserted once the previous functionality is back in place.

PietroLiuzzo commented 2 years ago

The SPARQL API and endpoint, as well as the triplestore, do work. However, there does not seem to be sufficient data stored there.

eu-genia commented 2 years ago

Then, as I assumed, this must be related to https://github.com/BetaMasaheft/Documentation/issues/1972

PietroLiuzzo commented 2 years ago

I have checked the RDF data and there is a problem caused by data2rdf.xslt, which concatenates the base URL with values that already contain it, producing URIs of this kind: https://betamasaheft.eu/https://betamasaheft.eu/PRS6064Krapf

I have identified the offending places and fixed the XSLT.
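To illustrate the kind of guard that avoids the doubled prefix, here is a minimal sketch in XQuery. It is not the actual change made to data2rdf.xslt; the function name `local:absolute-uri` and the variable `$local:base` are hypothetical and only serve the example.

```xquery
xquery version "3.1";

(: Hypothetical base URL variable, assumed for this sketch only :)
declare variable $local:base := 'https://betamasaheft.eu/';

(: Hypothetical helper: prepend the base only when the value
   is not already an absolute betamasaheft.eu URI :)
declare function local:absolute-uri($ref as xs:string) as xs:string {
    if (starts-with($ref, $local:base))
    then $ref
    else $local:base || $ref
};

(: Both calls yield the same, non-doubled URI :)
(
    local:absolute-uri('PRS6064Krapf'),
    local:absolute-uri('https://betamasaheft.eu/PRS6064Krapf')
)
```

Both items in the result sequence are https://betamasaheft.eu/PRS6064Krapf, whereas unconditional concatenation would turn the second one into the doubled form shown above.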

I am now setting up an environment to rebuild the RDF data and upload it again. This will take some time; perhaps you would prefer to do it yourself instead?

It should be sufficient to

The new data will replace the old.

The other issue may be solved similarly by replacing the current files in there and updating the switch to park the non-matching files in another directory. This will remediate the broken rdf alternate links. This is again easy but time-consuming. Let me know whether you want me to do that or you can do it.

eu-genia commented 2 years ago

It is perfectly fine if you can do it, thank you!

eu-genia commented 2 years ago

(sorry, just for information: the switch to update, so that the RDFs are sorted where they belong - which one is that?)

PietroLiuzzo commented 2 years ago

OK, I am now also checking other issues and will leave this one for last to save time.

PietroLiuzzo commented 2 years ago

Gender should work now, and all data is currently being updated (manuscripts and persons are done; the rest follows as needed).

PietroLiuzzo commented 2 years ago

The literature Sankey chart also has issues with range indexes. I added a missing "s" at line 78 of LitFlowRest.xqm.

PietroLiuzzo commented 2 years ago

The RDF data in Fuseki has been cleared of the above errors by running the following query from a separate instance.

xquery version "3.1";

import module namespace fusekisparql = 'https://www.betamasaheft.uni-hamburg.de/BetMas/sparqlfuseki' at "xmldb:exist:///db/apps/BetMas/fuseki/fuseki.xqm";
import module namespace config = "https://www.betamasaheft.uni-hamburg.de/BetMas/config" at "xmldb:exist:///db/apps/BetMas/modules/config.xqm";

let $dataset := 'betamasaheft'
let $operation := 'DELETE'
let $triples := '?sub ?pred ?obj'
let $selector := "?sub ?pred ?obj .
     FILTER(STRSTARTS(STR(?obj), 'https://betamasaheft.eu/https://betamasaheft.eu/')) "
return
 fusekisparql:editSelection($dataset, $operation, $triples,  $selector)

The function called there is the following:

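(: Builds a SPARQL update of the form
   <INSERT|DELETE> { $triples } WHERE { $selector }
   and POSTs it to the /update endpoint of the given Fuseki dataset. :)
declare function fusekisparql:editSelection($dataset, $InsertOrDelete, $triples, $selector) {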
    let $url := $fusekisparql:port||$dataset||'/update'
    let $sparqlupdate := $config:sparqlPrefixes || $InsertOrDelete || '
{ 
  '||$triples||'
} WHERE { ' || $selector || '}'
    let $req :=
    <http:request
        http-version="1.1"
        href="{xs:anyURI($url)}"
        method="POST">
        <http:header
            name="Content-type"
            value="application/sparql-update"></http:header>
        <http:body
            media-type="text/plain">{$sparqlupdate}</http:body>
    </http:request>
    let $post := http:send-request($req)[2]
    return
        $post
};
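To make it easier to see what is actually sent to Fuseki, the following sketch rebuilds only the update string, without the HTTP request. The empty `$prefixes` is an assumption standing in for `$config:sparqlPrefixes`, whose real value is not shown in this thread; everything else is copied from the query and function above.

```xquery
xquery version "3.1";

(: Stand-in for $config:sparqlPrefixes (assumption: real prefixes omitted) :)
let $prefixes := ''
let $operation := 'DELETE'
let $triples := '?sub ?pred ?obj'
let $selector := "?sub ?pred ?obj .
     FILTER(STRSTARTS(STR(?obj), 'https://betamasaheft.eu/https://betamasaheft.eu/')) "
return
    (: Same concatenation as in fusekisparql:editSelection :)
    $prefixes || $operation || '
{ 
  ' || $triples || '
} WHERE { ' || $selector || '}'
```

The result is a single DELETE ... WHERE update that removes every triple whose object starts with the doubled base URL. Note that the filter only inspects ?obj, so any triple carrying the doubled prefix in subject position would need an analogous filter on ?sub.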

The expanded data was downloaded at the end of June and has now been entirely retransformed. collection.xconf was applied to the expanded data, the data was imported into an eXist-db 6.0.1 instance, and then the following was run:

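for $file in $context
let $start-time := util:system-time()
(: transform to RDF/XML; on failure log the file and the error, leaving $rdf empty :)
let $rdf := try { transform:transform($file, $local:data2rdf, ()) } catch * { util:log('info', $file), util:log('info', $err:description) }
(: send the triples to Fuseki; on failure keep the error message :)
let $updateFuseki := try { updatefuseki:update($rdf, 'INSERT') } catch * { $err:description }
(: elapsed time in seconds: the duration is divided by one second :)
let $runtime-s := (util:system-time() - $start-time) div xs:dayTimeDuration('PT1S')
return
    'stored RDF/XML and updated fuseki in ' || $runtime-s || ' s'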

This was run on each collection with the corrected version of data2rdf.xslt. It was also run on a local eXist-db 6.0.1 instance and tested by storing to Apache Jena Fuseki 4.5, then used to store the data to the BM Apache Fuseki instance (4.3).

eu-genia commented 2 years ago

Just going through: I still get the same error in LitFlow (I do see the keywords). Am I doing something wrong? (two screenshots attached)

PietroLiuzzo commented 2 years ago

As reported, this is likely to have something to do with the indexes. As the error says, it is the $field parameter of the range index function that is getting nothing.

eu-genia commented 2 years ago

Gender gives few results, just 4 women, so some data must be missing. There used to be more: https://github.com/BetaMasaheft/Documentation/issues/1450

I did repopulate the rdf folders in the app, as suggested in https://github.com/BetaMasaheft/Documentation/issues/1972

eu-genia commented 2 years ago

The transformation result was wrong, as all RDF files had gender male. The XSLT has been corrected and the persons re-uploaded to Fuseki. There are now 81 women (and 1071 men).

SPARQL queries run through XQuery load very slowly; they run faster directly from the endpoint at https://betamasaheft.eu/fuseki/dataset.html

eu-genia commented 1 year ago

Visualization of SPARQL results with d3sparql is not working.

Also, the graphs etc. for gender are not working.

eu-genia commented 7 months ago

RDF files are not produced automatically, and the Fuseki bm database has not been updated since the new release (except for dillmann, which works OK). (screenshot)

dillmann: (screenshot)

bm: (screenshot)