blazegraph / database

Blazegraph High Performance Graph Database
GNU General Public License v2.0

Service URI is not allowed (but URI is good) #102

Closed redskate closed 6 years ago

redskate commented 6 years ago

Dear Blazegraph community

I am using bigdata.war version 2.1.5 (the same error also occurs with 2.1.4), and I need to run a federated query that fetches data from a Blazegraph instance running at https://ch.semweb.ch:8433/bigdata

The query I am using is the following:

prefix dbc: <http://dbpedia.org/resource/Category:>
prefix dct: <http://purl.org/dc/terms/>
prefix foaf:        <http://xmlns.com/foaf/0.1/>

SELECT *
{
    SERVICE <https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql> 
    {
      SELECT ?surname 
      { 
          ?musiker dct:subject dbc:Baroque_composers ;
          foaf:surname ?surname .
      }
     }
}

Although <https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql> is a valid URI, the RDF store running the query reports an error (whose text is probably wrong):

Unknown error: Service URI https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql is not allowed ???

The same query runs fine with <http://dbpedia.org/sparql> as the SERVICE URI.

But this service URI should indeed be allowed. What am I doing wrong? How can I get the results? How should I configure bigdata to "allow" this URI in a SERVICE query issued from "outside"? I ran the query at https://query.wikidata.org/, which also uses a Blazegraph engine. Running it on Virtuoso gives a different error (see below).

The problem must lie in the use of that URI: judging from its logs, the bigdata server (https://ch.semweb.ch:8433/bigdata) does not even receive a request. Nothing shows up in the log.

Thanks in advance.
Regards,
Fabio

PS: The exception reported by the Wikidata SPARQL server is:

SPARQL-QUERY: queryStr=prefix dbc:  <http://dbpedia.org/resource/Category:>
prefix dct: <http://purl.org/dc/terms/>
prefix foaf:        <http://xmlns.com/foaf/0.1/>

SELECT *
{
    SERVICE <https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql> 
    {
      SELECT ?surname 
      { 
          ?musiker dct:subject dbc:Baroque_composers ;
          foaf:surname ?surname .
      }
     }
}
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Service URI https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql is not allowed
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:293)
    at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:654)
    at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:288)
    at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
    at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:271)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1667)
    at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:318)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
    at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
    at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:497)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Service URI https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql is not allowed
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:865)
    at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:671)
    at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Caused by: java.lang.IllegalArgumentException: Service URI https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql is not allowed
    at com.bigdata.rdf.sparql.ast.service.ServiceRegistry.getServiceFactoryByServiceURI(ServiceRegistry.java:498)
    at com.bigdata.rdf.sparql.ast.service.ServiceNode.getResponsibleServiceFactory(ServiceNode.java:443)
    at com.bigdata.rdf.sparql.ast.service.ServiceNode.getRequiredBound(ServiceNode.java:408)
    at com.bigdata.rdf.sparql.ast.GroupNodeVarBindingInfo.<init>(GroupNodeVarBindingInfo.java:85)
    at com.bigdata.rdf.sparql.ast.GroupNodeVarBindingInfoMap.<init>(GroupNodeVarBindingInfoMap.java:62)
    at com.bigdata.rdf.sparql.ast.optimizers.ASTJoinGroupOrderOptimizer.optimizeJoinGroup(ASTJoinGroupOrderOptimizer.java:104)
    at com.bigdata.rdf.sparql.ast.optimizers.AbstractJoinGroupOptimizer.optimize(AbstractJoinGroupOptimizer.java:161)
    at com.bigdata.rdf.sparql.ast.optimizers.AbstractJoinGroupOptimizer.optimize(AbstractJoinGroupOptimizer.java:101)
    at com.bigdata.rdf.sparql.ast.optimizers.ASTOptimizerList.optimize(ASTOptimizerList.java:126)
    at com.bigdata.rdf.sparql.ast.eval.AST2BOpUtility.convert(AST2BOpUtility.java:269)
    at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.optimizeQuery(ASTEvalHelper.java:426)
    at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.evaluateTupleQuery(ASTEvalHelper.java:212)
    at com.bigdata.rdf.sail.BigdataSailTupleQuery.evaluate(BigdataSailTupleQuery.java:79)
    at com.bigdata.rdf.sail.BigdataSailTupleQuery.evaluate(BigdataSailTupleQuery.java:61)
    at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:75)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1713)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1569)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1534)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:747)
    ... 4 more
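Read from the bottom up, the trace above ends in ServiceRegistry.getServiceFactoryByServiceURI, reached via ASTEvalHelper.optimizeQuery. This suggests the URI is rejected inside the engine that received the query, while it was still optimizing it, which would be consistent with the remote server never logging a request. Below is a minimal Python sketch for pulling the innermost cause out of such a trace. The function name root_cause and the shortened SAMPLE are illustrative only and not part of Blazegraph.

```python
def root_cause(trace: str):
    """Return (exception message, first stack frame) of the innermost cause.

    The innermost cause is taken to be the last "Caused by:" block; if the
    trace has none, the first line of the trace is used instead.
    """
    lines = [ln.strip() for ln in trace.splitlines() if ln.strip()]
    if not lines:
        return None, None

    # Find the last "Caused by:" line (index 0 if there is none).
    idx = 0
    for i, ln in enumerate(lines):
        if ln.startswith("Caused by:"):
            idx = i

    message = lines[idx]
    if message.startswith("Caused by:"):
        message = message[len("Caused by:"):].strip()

    # The first "at ..." line after the message is the frame that threw.
    frame = None
    for ln in lines[idx + 1:]:
        if ln.startswith("at "):
            frame = ln[3:]
            break
    return message, frame


# Shortened excerpt of the trace quoted in this post.
SAMPLE = """java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Service URI https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql is not allowed
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
Caused by: java.lang.IllegalArgumentException: Service URI https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql is not allowed
    at com.bigdata.rdf.sparql.ast.service.ServiceRegistry.getServiceFactoryByServiceURI(ServiceRegistry.java:498)
    at com.bigdata.rdf.sparql.ast.service.ServiceNode.getResponsibleServiceFactory(ServiceNode.java:443)
"""

message, frame = root_cause(SAMPLE)
print(message)
print(frame)
```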

PS: The error reported when running that query on Virtuoso (http://dbpedia.org/sparql) is:

Virtuoso 42000 Error SQ070:SECURITY: Must have select privileges on view DB.DBA.SPARQL_SINV_2

SPARQL query:

#output-format:text/html
define sql:signal-void-variables 1 define input:default-graph-uri <http://dbpedia.org> prefix dbc:  <http://dbpedia.org/resource/Category:>
prefix dct: <http://purl.org/dc/terms/>
prefix foaf:        <http://xmlns.com/foaf/0.1/>

SELECT *
{
    SERVICE <https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql> 
    {
      SELECT ?surname 
      { 
          ?musiker dct:subject dbc:Baroque_composers ;
          foaf:surname ?surname .
      }
     }
}

redskate commented 6 years ago

The query I would like to compose and run successfully is the following:

#DBPedia
prefix skos:<http://www.w3.org/2004/02/skos/core#>
prefix dbc: <http://dbpedia.org/resource/Category:>
prefix dct: <http://purl.org/dc/terms/>
prefix dbr: <http://dbpedia.org/resource/>
prefix dbp: <http://dbpedia.org/property/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
#Europeana
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>

SELECT *
{

    ?musiker dct:subject dbc:Baroque_composers ;
    foaf:surname ?surname .

          SERVICE <http://sparql.europeana.eu/> {
            SELECT * 
            WHERE {

              ?CHO ore:proxyIn ?proxy;
              dc:title ?title ;
              dc:creator ?creator ;
              dc:date ?date .
              FILTER REGEX(str(?creator),str(?surname),"i").
              ?proxy edm:isShownBy ?mediaURL .
            } limit 100 
          } 

} limit 100

In other words: 1) search for composer names in DBPedia, then 2) search Europeana for some of their works. The goal is to demonstrate to a university that SERVICE works. I chose Blazegraph because I think it is a great engine. Unfortunately, I cannot get this SERVICE call running so easily.

thompsonbry commented 6 years ago

Blazegraph has the ability to enable all services or to whitelist specific services. This is described on the blazegraph wiki on the FederatedQuery page.

Bryan

redskate commented 6 years ago

Hallo Bryan

redskate commented 6 years ago

In https://wiki.blazegraph.com/wiki/index.php/FederatedQuery (the page you presumably mean) you write: "If you want to restrict the Federated Query service URIs that are allowed in SPARQL queries, you need to configure a service URLs whitelist."

I do not want to restrict anything, so I did nothing. But further on you write:

"An attempt of using a URL which is not whitelisted will cause an IllegalArgumentException error"

So (somewhere) in web.xml one has to add "some" URIs (which ones?). The URLs of the SPARQL engines that call this Blazegraph instance in a SERVICE clause?

redskate commented 6 years ago

I will spend some further time figuring out what is meant here ;)))

redskate commented 6 years ago

And please note that https://www.bigdata.com/bigdata/docs/api/com/bigdata/rdf/sparql/ast/eval/SearchServiceFactory.html is broken, at least for me.

redskate commented 6 years ago

I opened web.xml, uncommented the (commented-out) block and replaced the base URI "http://www.bigdata.com" with "https://ch.semweb.ch:8433", which gives the expression below. Then I saved web.xml, stopped and restarted Tomcat, went back to the Wikidata SPARQL server (a Blazegraph) and got the same error.

<context-param>
  <description>List of allowed services.</description>
  <param-name>serviceWhitelist</param-name>
  <param-value>https://ch.semweb.ch:8433/rdf/search#search,https://ch.semweb.ch:8433/rdf#describe</param-value>
</context-param>

redskate commented 6 years ago

Then I tried adding https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql as a URL to the whitelist (see below). No change: the service is still unavailable.

<context-param>
  <description>List of allowed services.</description>
  <param-name>serviceWhitelist</param-name>
  <param-value>https://ch.semweb.ch:8433/rdf/search#search,https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql,https://ch.semweb.ch:8433/rdf#describe</param-value>
</context-param>

redskate commented 6 years ago

Finally I also added the URL of the calling engine, https://query.wikidata.org/, to that web.xml parameter, without any noticeable effect. So it seems that it does not work with that description and these resources? Sorry.

<context-param>
  <description>List of allowed services.</description>
  <param-name>serviceWhitelist</param-name>
  <param-value>https://query.wikidata.org/, https://ch.semweb.ch:8433/rdf/search#search,https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql,https://ch.semweb.ch:8433/rdf#describe</param-value>
</context-param>
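
To check what a given web.xml actually lists, the serviceWhitelist parameter can be read back programmatically. The following is a minimal Python sketch, not Blazegraph code: the helper names are made up for illustration, and it assumes the value is a comma-separated list whose entries are compared as exact strings (how Blazegraph itself splits and matches the entries is not confirmed in this thread). Note that it only inspects the file it is given. Going by the resolution later in this thread, the setting that matters is the one on the engine that executes the SERVICE clause, not the one on the target server.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def service_whitelist(web_xml: str):
    """Return the entries of the serviceWhitelist context-param, or None if absent."""
    root = ET.fromstring(web_xml)
    for elem in root.iter():
        if local(elem.tag) != "context-param":
            continue
        fields = {local(child.tag): (child.text or "").strip() for child in elem}
        if fields.get("param-name") == "serviceWhitelist":
            # Assumption: comma-separated entries; whitespace around them is trimmed here.
            raw = fields.get("param-value", "")
            return [u.strip() for u in raw.split(",") if u.strip()]
    return None


# The last whitelist attempt from this thread, wrapped in a minimal web-app element.
WEB_XML = """<web-app>
  <context-param>
    <description>List of allowed services.</description>
    <param-name>serviceWhitelist</param-name>
    <param-value>https://query.wikidata.org/, https://ch.semweb.ch:8433/rdf/search#search,https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql,https://ch.semweb.ch:8433/rdf#describe</param-value>
  </context-param>
</web-app>"""

entries = service_whitelist(WEB_XML)
target = "https://ch.semweb.ch:8433/bigdata/namespace/DBPedia/sparql"
# Assumption: membership is an exact string match on the full service URI.
print(len(entries), target in entries)
```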

redskate commented 6 years ago

Would anyone else please share some precise and informative piece of information (so that it might run)? Thanks in advance.

redskate commented 6 years ago

And this is the SOLUTION to this ISSUE:

There is no need to change web.xml to allow a bigdata instance to be used from an external SPARQL engine. The problem lay in both engines I used to issue those SERVICE calls to my bigdata instance.

I issued the same call from a "sister" bigdata installation in my infrastructure and it ran smoothly. I hope this will be useful for someone.

It ran smoothly with the default web.xml whitelist as long as the SERVICE URL uses the HTTP scheme.
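
For anyone who wants to script that check, here is a minimal Python sketch that builds a SPARQL Protocol GET request carrying the federated query, to be sent to an engine you control. Both URLs are placeholders and not taken from this thread: LOCAL_ENDPOINT stands for the "sister" instance and SERVICE_URI for the target namespace reached over plain HTTP. The sketch only builds the request. Sending it is left as a commented-out line.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoints; replace with your own installation.
LOCAL_ENDPOINT = "http://localhost:9999/bigdata/namespace/kb/sparql"
SERVICE_URI = "http://example.org/bigdata/namespace/DBPedia/sparql"

PREFIXES = (
    "prefix dbc: <http://dbpedia.org/resource/Category:>\n"
    "prefix dct: <http://purl.org/dc/terms/>\n"
    "prefix foaf: <http://xmlns.com/foaf/0.1/>\n"
)

INNER_SELECT = (
    "SELECT ?surname { ?musiker dct:subject dbc:Baroque_composers ; "
    "foaf:surname ?surname . }"
)


def build_service_query(service_uri: str, inner_select: str) -> str:
    """Wrap an inner SELECT in a SERVICE clause that points at service_uri."""
    return "%sSELECT * {\n  SERVICE <%s> {\n    %s\n  }\n}" % (
        PREFIXES, service_uri, inner_select)


def build_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request for the query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


QUERY = build_service_query(SERVICE_URI, INNER_SELECT)
request = build_request(LOCAL_ENDPOINT, QUERY)
print(request.get_method(), request.full_url[:60])

# To actually run it against a reachable endpoint:
# import urllib.request
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

If the calling engine rejects the service URI, the HTTP response from LOCAL_ENDPOINT should carry the "is not allowed" error, and the target server should again see no request.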

kappagithub commented 3 years ago

We had the same problem with a container installation of Wikibase. The whitelist must be activated before it takes effect. With containers this is easily achieved by restarting the container.