AKSW / RDFUnit

An RDF Unit Testing Suite
http://RDFUnit.aksw.org
Apache License 2.0

Remove the dependency to LOV for every validation #34

Closed jimkont closed 5 years ago

jimkont commented 9 years ago

LOV is not very stable any more, which results in failed RDFUnit executions. Instead, generate and store the vocabulary metadata offline and save it as a Java resource that will be updated before each release.

rtroncy commented 9 years ago

This is quite a strong statement. LOV has had difficulties over the last few weeks due to attacks on OKFN servers and a lack of response from the OKFN team. The LOV team definitely aims to be responsive and reliable. I suggest you discuss this issue with @pyvandenbussche.

jimkont commented 9 years ago

LOV is a great service and I plan to continue depending on it. What happens now is that RDFUnit calls LOV at the start of every evaluation, and if that call fails we cannot do any prefix dereferencing or automatic schema discovery. This task is about caching the vocabularies locally for faster access, with the option to additionally call LOV for up-to-date data. Calling the online service would then be optional.

I changed the issue title to make the statement less strong :)
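A minimal sketch of the lookup order described above, assuming a prefix-to-namespace map that ships with the release and an optional live lookup. The class `PrefixResolver` and its members are hypothetical names for illustration and are not taken from the RDFUnit code base:

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the RDFUnit code base.
final class PrefixResolver {
    // Prefix -> namespace data stored offline and shipped with the release.
    private final Map<String, String> bundled;
    // Optional live lookup (for example against LOV); null disables it.
    private final Function<String, Optional<String>> onlineLookup;

    PrefixResolver(Map<String, String> bundled,
                   Function<String, Optional<String>> onlineLookup) {
        this.bundled = bundled;
        this.onlineLookup = onlineLookup;
    }

    Optional<String> resolve(String prefix) {
        if (onlineLookup != null) {
            try {
                // Prefer up-to-date data when the online service answers.
                Optional<String> fresh = onlineLookup.apply(prefix);
                if (fresh.isPresent()) {
                    return fresh;
                }
            } catch (RuntimeException e) {
                // Online service unavailable: fall through to the bundled data.
            }
        }
        return Optional.ofNullable(bundled.get(prefix));
    }
}
```

With a `null` online lookup the resolver never touches the network, so an outage of the online service cannot fail a validation run.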

jimkont commented 9 years ago

BTW, @pyvandenbussche, I again had to use the internal LOV URI to get results from the endpoint (see https://github.com/AKSW/RDFUnit/commit/6d0a0be2c055c2ccca08328f3a43eb0e0fa5f0c3 ): http://s110.okserver.org:3030/lov/sparql

pyvandenbussche commented 9 years ago

Hi Dimitris,

That looks like a very sound approach: keep a cache in case LOV is not available, and in case it is, run the query live.

I will look at the URI again. You can now also consider using the LOV Linked Data Fragment: https://plus.google.com/+PierreYvesVandenbussche/posts/Bb3dRqCp3hi

Cheers.

jimkont commented 9 years ago

FYI, I already had caching on the client with a one-week expiration; this issue is about putting the cache directly in the repo.
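For illustration, a one-week expiration check could look like the following. This is a sketch under assumptions: `CacheExpiry` and `isExpired` are hypothetical names, and the actual client-side cache in RDFUnit may be implemented differently.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch; not the actual RDFUnit cache implementation.
final class CacheExpiry {
    // Cached entries older than one week are considered stale.
    static final Duration MAX_AGE = Duration.ofDays(7);

    // Returns true when the entry fetched at 'fetchedAt' is older than MAX_AGE at time 'now'.
    static boolean isExpired(Instant fetchedAt, Instant now) {
        return Duration.between(fetchedAt, now).compareTo(MAX_AGE) > 0;
    }
}
```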

roland-c commented 9 years ago

Since LOV has been down lately, the address of the SPARQL endpoint has changed, and an additional graph specification is now needed in the query. The following changes must be made to fix this:

https://github.com/AKSW/RDFUnit/blob/master/rdfunit-core/src/main/java/org/aksw/rdfunit/Utils/RDFUnitUtils.java#L105 >>
Source lov = new EndpointTestSource("lov", "http://lov.okfn.org", "http://lov.okfn.org/dataset/lov/sparql", Arrays.asList("http://lov.okfn.org/dataset/lov"), null);

https://github.com/AKSW/RDFUnit/blob/master/rdfunit-core/src/main/java/org/aksw/rdfunit/Utils/RDFUnitUtils.java#L129 >>
"WHERE{ GRAPH <http://lov.okfn.org/dataset/lov> { \n" +

https://github.com/AKSW/RDFUnit/blob/master/rdfunit-core/src/main/java/org/aksw/rdfunit/Utils/RDFUnitUtils.java#L134 >>
"}} \n" +
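
To show how the two query fragments fit together, here is a minimal sketch that wraps a pattern in a `GRAPH` clause. The helper `LovQuery.inGraph` and the pattern passed to it are hypothetical; the real query body in `RDFUnitUtils.java` is not reproduced here. Note that SPARQL requires the graph IRI to be enclosed in angle brackets.

```java
// Hypothetical sketch; the actual query in RDFUnitUtils.java is not shown here.
final class LovQuery {
    // Named graph given in the comment above.
    static final String LOV_GRAPH = "http://lov.okfn.org/dataset/lov";

    // Wraps a graph pattern in "WHERE { GRAPH <iri> { ... }}".
    static String inGraph(String graphIri, String pattern) {
        return "WHERE { GRAPH <" + graphIri + "> { \n"
                + pattern + " \n"
                + "}} \n";
    }
}
```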

pyvandenbussche commented 9 years ago

Hi Dimitris,

I've cleaned up the way the LOV SPARQL endpoint is accessed. I blocked the internal URL, and now the only URL for the LOV endpoint is: http://lov.okfn.org/dataset/lov/sparql

This URL will not change in the future.

jimkont commented 9 years ago

Hi @pyvandenbussche, I switched to the new URL and I get the following error:

java.lang.RuntimeException: com.hp.hpl.jena.query.QueryException: Endpoint returned Content-Type: text/html which is not currently supported for SELECT queries

Also, make sure you allow the additional URL parameters default-graph-uri and named-graph-uri to be passed through to the internal endpoint.
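
For reference, these two parameters are part of the SPARQL Protocol and travel as ordinary query-string parameters next to `query`. A minimal sketch of such a request URL follows; the class `SparqlRequestUrl` is a hypothetical helper, not the code path Jena uses internally.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not how Jena builds its requests internally.
final class SparqlRequestUrl {
    // Builds a GET URL for a SPARQL query; graph arguments may be null to omit them.
    static String build(String endpoint, String query,
                        String defaultGraph, String namedGraph) {
        StringBuilder url = new StringBuilder(endpoint)
                .append("?query=").append(encode(query));
        if (defaultGraph != null) {
            url.append("&default-graph-uri=").append(encode(defaultGraph));
        }
        if (namedGraph != null) {
            url.append("&named-graph-uri=").append(encode(namedGraph));
        }
        return url.toString();
    }

    // Percent-encodes a parameter value as UTF-8 form data.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

If a proxy in front of the internal endpoint drops these parameters, the dataset selection requested by the client is lost.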

jimkont commented 9 years ago

IIRC we had the same problem before, and you told me it was a Jena v2.11 vs. v2.12+ issue.

pyvandenbussche commented 9 years ago

I will try to find our earlier discussion on that topic to see what the solution was :)

pyvandenbussche commented 9 years ago

Now everything is working using the permanent URL: http://lov.okfn.org/dataset/lov/sparql

Thanks @jimkont for your help.