eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License

Eclipse RDF4j queries are not returning results for 2.3 or newer version #2310

Closed NarendraSadhu closed 4 years ago

NarendraSadhu commented 4 years ago

We are using an RDF4J SPARQLRepository to access a remote Blazegraph server and query it. With RDF4J 2.2 this works as expected. With RDF4J 2.3 or newer, however, queries return no results.

abrokenjester commented 4 years ago

Can you provide more information about the circumstances? What version of Blazegraph are you using, can you give an example query and data set where the problem occurs, and what does the exact configuration of your SPARQLRepository look like (endpoint URLs, any authentication, etc.)?

abrokenjester commented 4 years ago

Possible duplicate of #1393 (which was closed because we couldn't reproduce).

NarendraSadhu commented 4 years ago

SPARQLRepository Configuration:

    @Bean
    public Repository getRepository() {
        // Blazegraph SPARQL endpoint
        String sparqlEndpoint = "http://localhost:9999/blazegraph/sparql";
        return new SPARQLRepository(sparqlEndpoint);
    }

Sample Query:

prefix projects: <http://www.siemens.com/krawalcloud/projects/> 
SELECT DISTINCT ?objectCounter 
WHERE {?parent  projects:hasUUID "ad0ee396-dcc5-4395-b23e-3c476cea3330"^^xsd:string ;
                projects:hasVariant ?variant.
       ?variant projects:hasUUID "a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b"^^xsd:string ;
                projects:hasObjectCounter  ?objectCounter.
      }

abrokenjester commented 4 years ago

Can you show me what the result of this query is when you execute it directly on Blazegraph (instead of via the RDF4J SPARQLRepository)?

NarendraSadhu commented 4 years ago

@jeenbroekstra Please find the attached image (Query Result). We get a query result when we execute the query directly in Blazegraph, but when we execute it via the RDF4J SPARQLRepository with version 2.3 or newer, we get an empty result.

abrokenjester commented 4 years ago

Based on this information I cannot reproduce the issue, unfortunately. On a locally running Blazegraph (2.1.4) I added minimal data to match your query:

prefix projects: <http://www.siemens.com/krawalcloud/projects/>
INSERT DATA {
<urn:project1>  projects:hasUUID "ad0ee396-dcc5-4395-b23e-3c476cea3330"^^xsd:string ;
                projects:hasVariant <urn:variant-1>.
       <urn:variant-1> projects:hasUUID "a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b"^^xsd:string ;
                projects:hasObjectCounter 2.
}

Then, running your SPARQL query via a SPARQLRepository (using RDF4J 3.2.1), I get back the expected result:

    Repository rep = new SPARQLRepository("http://localhost:10080/blazegraph/sparql");

    try (RepositoryConnection conn = rep.getConnection()) {
        String query = "prefix projects: <http://www.siemens.com/krawalcloud/projects/> \n" +
                "SELECT DISTINCT ?objectCounter \n" +
                "WHERE {?parent     projects:hasUUID \"ad0ee396-dcc5-4395-b23e-3c476cea3330\"^^xsd:string ;\n" +
                "                   projects:hasVariant ?variant.\n" +
                "       ?variant projects:hasUUID \"a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b\"^^xsd:string ;\n" +
                "                   projects:hasObjectCounter  ?objectCounter.\n" +
                "      }";
        TupleQueryResult result = conn.prepareTupleQuery(query).evaluate();

        result.forEach(System.out::println);
    }

output:

 [objectCounter="2"^^<http://www.w3.org/2001/XMLSchema#integer>]

abrokenjester commented 4 years ago

Could you try and run the following curl command:

curl -v http://localhost:9999/blazegraph/sparql -d 'query=prefix projects: <http://www.siemens.com/krawalcloud/projects/>
SELECT DISTINCT ?objectCounter
WHERE {?parent  projects:hasUUID "ad0ee396-dcc5-4395-b23e-3c476cea3330"^^xsd:string ;
                projects:hasVariant ?variant.
       ?variant projects:hasUUID "a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b"^^xsd:string ;
                projects:hasObjectCounter  ?objectCounter.
      }'

and tell me the complete output you see?

When I run this locally, I get this:

*   Trying 127.0.0.1:9999...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 9999 (#0)
> POST /blazegraph/sparql HTTP/1.1
> Host: localhost:9999
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Length: 387
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 387 out of 387 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: application/sparql-results+xml
< Transfer-Encoding: chunked
< Server: Jetty(9.3.9.v20160517)
<
<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
        <head>
                <variable name='objectCounter'/>
        </head>
        <results>
                <result>
                        <binding name='objectCounter'>
                                <literal datatype='http://www.w3.org/2001/XMLSchema#integer'>2</literal>
                        </binding>
                </result>
        </results>
</sparql>
* Connection #0 to host localhost left intact

NarendraSadhu commented 4 years ago

We tried to execute the curl command above, but got the error shown below.

Command:

curl -v http://localhost:9999/blazegraph/sparql -d 'query=prefix projects: http://www.siemens.com/krawalcloud/projects/ SELECT DISTINCT ?objectCounter WHERE {?parent projects:hasUUID "ad0ee396-dcc5-4395-b23e-3c476cea3330"^^xsd:string ; projects:hasVariant ?variant. ?variant projects:hasUUID "a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b"^^xsd:string ; projects:hasObjectCounter ?objectCounter.}'

Error:

The filename, directory name, or volume label syntax is incorrect.

But we are able to connect to Blazegraph using the command below.

command:

curl -v http://localhost:9999/blazegraph/sparql

* Connection #0 to host localhost left intact

abrokenjester commented 4 years ago

Not sure why you're getting an error on that, but it may have to do with newlines in the query. Can you try this instead:

 curl -v http://localhost:10080/blazegraph/sparql -d 'query=prefix projects: <http://www.siemens.com/krawalcloud/projects/> SELECT DISTINCT ?parent ?objectCounter WHERE {?parent     projects:hasUUID "ad0ee396-dcc5-4395-b23e-3c476cea3330"^^xsd:string ;                 projects:hasVariant ?variant.       ?variant projects:hasUUID "a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b"^^xsd:string ;                    projects:hasObjectCounter  ?objectCounter.      }'

It's really important that I see the full result of the actual SPARQL query as it gets sent in the response (as well as the response headers), because I want to determine if the problem is somehow caused by a syntax issue in your data (though it seems unlikely).

Alternatively, we will have to start looking at log files, in particular the logs of Blazegraph itself (incoming request, outgoing response when you do the query), and hopefully also what RDF4J receives.
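
If shell quoting keeps getting in the way of the curl command (the error message quoted above looks like the Windows command prompt rejecting the single-quoted argument, though that is a guess), the same POST can be sent from plain Java instead. What follows is a minimal sketch, assuming Java 11 or newer; the class name RawSparqlPost and the helper formBody are made up for illustration and are not part of RDF4J or Blazegraph. It sends the query as a percent-encoded form field and prints the status code, the response headers and the raw response body, which is the information asked for above. The Accept header is an assumption chosen to match the XML response shown earlier in this thread.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RawSparqlPost {

    // The query from this thread, unchanged apart from Java string quoting.
    public static final String QUERY =
            "prefix projects: <http://www.siemens.com/krawalcloud/projects/>\n"
            + "SELECT DISTINCT ?objectCounter\n"
            + "WHERE {?parent  projects:hasUUID \"ad0ee396-dcc5-4395-b23e-3c476cea3330\"^^xsd:string ;\n"
            + "                projects:hasVariant ?variant.\n"
            + "       ?variant projects:hasUUID \"a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b\"^^xsd:string ;\n"
            + "                projects:hasObjectCounter ?objectCounter.\n"
            + "      }";

    // Form body with the same "query" field that curl -d sends, percent-encoded
    // so that newlines, quotes and angle brackets survive the transfer.
    public static String formBody(String sparql) {
        return "query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String body = formBody(QUERY);
        if (args.length == 0) {
            // Dry run: no endpoint given, so only print what would be sent.
            System.out.println(body);
            return;
        }

        HttpRequest request = HttpRequest.newBuilder(URI.create(args[0]))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Accept", "application/sparql-results+xml")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Status code, response headers, then the raw result document.
        System.out.println("HTTP " + response.statusCode());
        response.headers().map()
                .forEach((name, values) -> System.out.println(name + ": " + values));
        System.out.println();
        System.out.println(response.body());
    }
}
```

Run it with the endpoint as the only argument, for example `java RawSparqlPost.java http://localhost:9999/blazegraph/sparql`; without an argument it only prints the encoded request body. Because no shell ever sees the query text, quoting and newlines cannot be mangled on the way to the server.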

abrokenjester commented 4 years ago

If you have an option to share (part of) your dataset with me so that I can upload to blazegraph and try and reproduce that way, that would be great.

abrokenjester commented 4 years ago

Finally, another thing to look at might be the Java code you use to execute the query and process the result.

NarendraSadhu commented 4 years ago

@jeenbroekstra We can't share our project dataset, but we are using the ModelBuilder class to insert data into Blazegraph. Please find sample code below.

    Repository rep = new SPARQLRepository("http://localhost:9999/blazegraph/sparql");
    rep.initialize();
    try (RepositoryConnection conn = rep.getConnection()) {
        ModelBuilder builder = new ModelBuilder();

        String projectUUID = "ad0ee396-dcc5-4395-b23e-3c476cea3330";
        String variantUUID = "a3fea6fa-3f0b-4ef5-8e8e-b2a8397f435b";

        String projectVariant = projectUUID + "/variants/" + variantUUID;

        builder.setNamespace("root", "http://www.siemens.com/krawalcloud/projects/").setNamespace(RDF.NS);

        // Project node: its UUID plus a link to the variant node
        builder.subject("http://www.siemens.com/krawalcloud/projects/" + projectUUID)
                .add("http://www.siemens.com/krawalcloud/projects/" + "hasUUID", projectUUID)
                .add("http://www.siemens.com/krawalcloud/projects/" + "hasVariant", conn.getValueFactory().createIRI(("http://www.siemens.com/krawalcloud/projects/" + projectVariant)));

        // Variant node: its UUID plus the object counter
        builder.subject("http://www.siemens.com/krawalcloud/projects/" + projectVariant)
                .add("http://www.siemens.com/krawalcloud/projects/" + "hasUUID", variantUUID)
                .add("http://www.siemens.com/krawalcloud/projects/" + "hasObjectCounter", 2);

        Model model = builder.build();
        conn.add(model);
    }

We are not facing any issues while inserting data. But when we try to fetch the data, we get a result with version 2.2 and an empty result with newer versions.

Could you try to insert the data using ModelBuilder (version 2.2) and then fetch it with different versions?

abrokenjester commented 4 years ago

I still cannot reproduce. I get back a result from Blazegraph (2.1.4), as expected, whether I use RDF4J 2.2.1, 2.3.1, or 3.2.2.

To move forward on this, some further questions and things to investigate:

  1. Can you tell me how large your Blazegraph database is, in terms of the number of statements? A ballpark figure is fine.
  2. Is it just this one query you have shown here that is not giving a result, or are all queries suddenly not working?
  3. Have a look in the RDF4J and Blazegraph logs to see if anything suspicious turns up. If necessary, at the RDF4J end, configure logging to use DEBUG level to get more detailed information about what happens when it sends a query and receives a response.
  4. Can you show exactly how you execute the query and process the result? That is, the code used to open a connection, create the query, evaluate it, and then process the query result.
  5. Try to run the curl command I gave you earlier.

More generally: if there is any way in which you can create a complete, minimal, verifiable example (data + code + query) that demonstrates the problem in a way that we can reproduce, that would be really helpful.

hmottestad commented 4 years ago

Just popping my head in here to +1 the need for a reproducible example. At this point it’s pretty much the only way for us to find out what is going on.

abrokenjester commented 4 years ago

I'm closing this for now as we cannot reproduce the problem. If you can provide further details, feel free to comment and we can re-open.