SmartDataAnalytics / jena-sparql-api

A collection of Jena-extensions for hiding SPARQL-complexity from the application layer

Unsupported operation exception when calling tryMatch #35

Open fanavarro opened 4 years ago

fanavarro commented 4 years ago

Hi, I need to check whether two SPARQL queries are equivalent, and I think this library is useful for that. However, when I call the tryMatch function of the SparqlQueryContainmentUtils class, I get the following exception:

Exception in thread "main" java.lang.UnsupportedOperationException
    at org.aksw.jena_sparql_api.algebra.analysis.VarUsageAnalyzer2Visitor.visit(VarUsageAnalyzer2Visitor.java:216)
    at org.apache.jena.sparql.algebra.op.OpLeftJoin.visit(OpLeftJoin.java:64)
    at org.aksw.jena_sparql_api.algebra.analysis.VarUsageAnalyzer2Visitor.visit(VarUsageAnalyzer2Visitor.java:369)
    at org.apache.jena.sparql.algebra.op.OpFilter.visit(OpFilter.java:132)
    at org.aksw.jena_sparql_api.algebra.analysis.VarUsageAnalyzer2Visitor.visit(VarUsageAnalyzer2Visitor.java:307)
    at org.apache.jena.sparql.algebra.op.OpExt.visit(OpExt.java:57)
    at org.aksw.jena_sparql_api.algebra.analysis.VarUsageAnalyzer2Visitor.analyze(VarUsageAnalyzer2Visitor.java:487)
    at org.aksw.jena_sparql_api.algebra.analysis.VarUsageAnalyzer2Visitor.analyze(VarUsageAnalyzer2Visitor.java:481)
    at org.aksw.jena_sparql_api.query_containment.core.SparqlQueryContainmentUtils.tryMatch(SparqlQueryContainmentUtils.java:199)
    at org.aksw.jena_sparql_api.query_containment.core.SparqlQueryContainmentUtils.tryMatch(SparqlQueryContainmentUtils.java:155)
    at test.Test.main(Test.java:24)

The queries that I would like to compare are the following:

SELECT  ?instancia ?instanciaLabel ?i0 ?predicate_i0 ?i0Label
FROM <http://dbpedia.org>
WHERE
  { ?instancia  a                   <http://dbpedia.org/ontology/Musical> ;
              ?predicate_i0         ?i0
    FILTER ( ?predicate_i0 IN (<http://dbpedia.org/ontology/musicBy>) )
    OPTIONAL
      { ?instancia  <http://www.w3.org/2000/01/rdf-schema#label>  ?instanciaLabel
        FILTER ( lang(?instanciaLabel) = "en" )
      }
    OPTIONAL
      { ?i0  <http://www.w3.org/2000/01/rdf-schema#label>  ?i0Label
        FILTER ( lang(?i0Label) = "en" )
      }
  }

SELECT  ?i0 ?predicate_i0 ?i0Label ?instancia ?instanciaLabel
FROM <http://dbpedia.org>
WHERE
  { ?instancia  a                   <http://dbpedia.org/ontology/Musical> ;
              ?predicate_i0         ?i0
    FILTER ( ?predicate_i0 IN (<http://dbpedia.org/ontology/musicBy>) )
    OPTIONAL
      { ?i0  <http://www.w3.org/2000/01/rdf-schema#label>  ?i0Label
        FILTER ( lang(?i0Label) = "en" )
      }
    OPTIONAL
      { ?instancia  <http://www.w3.org/2000/01/rdf-schema#label>  ?instanciaLabel
        FILTER ( lang(?instanciaLabel) = "en" )
      }
  }

As you can see, both queries are the same; only the order of the projected variables and of the OPTIONAL blocks differs. My code is the following:

package test;

import org.aksw.jena_sparql_api.query_containment.core.SparqlQueryContainmentUtils;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.Syntax;

public class Test {

    public static void main(String[] args) {
        Query q1 = QueryFactory.create(getQ1(), Syntax.syntaxSPARQL_11);
        Query q2 = QueryFactory.create(getQ2(), Syntax.syntaxSPARQL_11);
        boolean isContained = SparqlQueryContainmentUtils.tryMatch(q1, q2);

        if (isContained) {
            System.out.println("Contained.");
        } else {
            System.out.println("Not contained.");
        }
    }

    private static String getQ1() {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT  ?instancia ?instanciaLabel ?i0 ?predicate_i0 ?i0Label\n");
        sb.append("FROM <http://dbpedia.org>\n");
        sb.append("WHERE\n");
        sb.append("  { ?instancia  a                   <http://dbpedia.org/ontology/Musical> ;\n");
        sb.append("              ?predicate_i0         ?i0\n");
        sb.append("    FILTER ( ?predicate_i0 IN (<http://dbpedia.org/ontology/musicBy>) )\n");
        sb.append("    OPTIONAL\n");
        sb.append("      { ?instancia  <http://www.w3.org/2000/01/rdf-schema#label>  ?instanciaLabel\n");
        sb.append("        FILTER ( lang(?instanciaLabel) = \"en\" )\n");
        sb.append("      }\n");
        sb.append("    OPTIONAL\n");
        sb.append("      { ?i0  <http://www.w3.org/2000/01/rdf-schema#label>  ?i0Label\n");
        sb.append("        FILTER ( lang(?i0Label) = \"en\" )\n");
        sb.append("      }\n");
        sb.append("  }\n");
        return sb.toString();
    }

    private static String getQ2() {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT  ?i0 ?predicate_i0 ?i0Label ?instancia ?instanciaLabel\n");
        sb.append("FROM <http://dbpedia.org>\n");
        sb.append("WHERE\n");
        sb.append("  { ?instancia  a                   <http://dbpedia.org/ontology/Musical> ;\n");
        sb.append("              ?predicate_i0         ?i0\n");
        sb.append("    FILTER ( ?predicate_i0 IN (<http://dbpedia.org/ontology/musicBy>) )\n");
        sb.append("    OPTIONAL\n");
        sb.append("      { ?i0  <http://www.w3.org/2000/01/rdf-schema#label>  ?i0Label\n");
        sb.append("        FILTER ( lang(?i0Label) = \"en\" )\n");
        sb.append("      }\n");
        sb.append("    OPTIONAL\n");
        sb.append("      { ?instancia  <http://www.w3.org/2000/01/rdf-schema#label>  ?instanciaLabel\n");
        sb.append("        FILTER ( lang(?instanciaLabel) = \"en\" )\n");
        sb.append("      }\n");
        sb.append("  }\n");
        return sb.toString();
    }
}

I would like to know which SPARQL operators this library supports and what its limitations are.

Thanks in advance.

Aklakan commented 4 years ago

I am looking into it; at first glance it looks like a bug in the code. In principle, the approach should be exact for conjunctive queries (basic graph patterns + filters + projection). For other SPARQL operators, a best-effort approach is taken: the leaf nodes of the two queries' normalized algebra expressions are first converted to conjunctive queries, all possible mappings between those leaves are enumerated, and then it is checked whether, moving bottom-up, the variables/expressions of the intermediate nodes can still be mapped.

So the approach works bottom-up: it maps the leaves first and then tries to map the variables/expressions of the intermediate operations while moving up. However, for performance reasons, it does not try to enumerate further equivalence transformations of a given algebra expression.
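
To illustrate the conjunctive case mentioned above, here is a minimal, self-contained sketch of the classic homomorphism test for basic graph patterns under set semantics. It is not the jena-sparql-api implementation: the class `BgpHomomorphismSketch`, its helpers `t` and `bgp`, and the string encoding of terms (a leading `?` marks a variable) are all assumptions made for this example, and projection, filters and OPTIONAL are deliberately ignored.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy containment check for plain BGPs; hypothetical names, not the library's API. */
final class BgpHomomorphismSketch {

    /** Builds a triple pattern as a 3-element array: subject, predicate, object. */
    static String[] t(String s, String p, String o) {
        return new String[] { s, p, o };
    }

    /** Builds a basic graph pattern from triple patterns. */
    static List<String[]> bgp(String[]... triples) {
        return Arrays.asList(triples);
    }

    /** Assumption of this sketch: terms starting with '?' are variables. */
    static boolean isVar(String term) {
        return term.startsWith("?");
    }

    /** Maps 'from' onto 'to'; constants must match exactly, variables must stay consistent. */
    static boolean bind(String from, String to, Map<String, String> mapping) {
        if (!isVar(from)) {
            return from.equals(to);
        }
        String previous = mapping.putIfAbsent(from, to);
        return previous == null || previous.equals(to);
    }

    /** Backtracking search that maps pattern[i..] onto triples of target. */
    static boolean search(List<String[]> pattern, int i, List<String[]> target,
                          Map<String, String> mapping) {
        if (i == pattern.size()) {
            return true; // every triple pattern found an image
        }
        String[] triple = pattern.get(i);
        for (String[] candidate : target) {
            Map<String, String> next = new HashMap<>(mapping); // copy so failed tries leave no trace
            if (bind(triple[0], candidate[0], next)
                    && bind(triple[1], candidate[1], next)
                    && bind(triple[2], candidate[2], next)
                    && search(pattern, i + 1, target, next)) {
                return true;
            }
        }
        return false;
    }

    /** True if some variable mapping sends every triple of 'from' to a triple of 'to'. */
    static boolean hasHomomorphism(List<String[]> from, List<String[]> to) {
        return search(from, 0, to, new HashMap<>());
    }

    /** Two BGPs are treated as equivalent if homomorphisms exist in both directions. */
    static boolean equivalent(List<String[]> a, List<String[]> b) {
        return hasHomomorphism(a, b) && hasHomomorphism(b, a);
    }

    public static void main(String[] args) {
        List<String[]> a = bgp(t("?x", "rdf:type", "dbo:Musical"), t("?x", "dbo:musicBy", "?y"));
        // Same pattern with renamed variables and reordered triples.
        List<String[]> b = bgp(t("?b", "dbo:musicBy", "?c"), t("?b", "rdf:type", "dbo:Musical"));
        // A strictly more general pattern.
        List<String[]> c = bgp(t("?x", "rdf:type", "dbo:Musical"));

        System.out.println(equivalent(a, b));
        System.out.println(hasHomomorphism(c, a));
        System.out.println(hasHomomorphism(a, c));
    }
}
```

In this toy setting, a homomorphism from the more general pattern into the more specific one means that the specific pattern is contained in the general one. Reordering triples or renaming variables does not affect the result, which is why the conjunctive case can be decided exactly; operators such as OPTIONAL need additional handling that this sketch does not attempt.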