eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/

NullPointerException in `ExtensibleDynamicEvaluationStatistics$ExtensibleDynamicEvaluationStatisticsCardinalityCalculator.getContextCardinality` #5133

Open hurui200320 opened 5 days ago

hurui200320 commented 5 days ago

Current Behavior

I'm currently implementing a storage sail based on the experimental extensible store. When I applied the RepositorySPARQLComplianceTestSuite to it (using new SailRepository(new FooStore())), two tests failed with a NullPointerException (NPE). The stack trace points to ExtensibleDynamicEvaluationStatistics$ExtensibleDynamicEvaluationStatisticsCardinalityCalculator.getContextCardinality:

        @Override
        protected double getContextCardinality(Var var) {
            synchronized (monitor) {
                if (var.getValue() == null) { // This line!! Because var itself is null
                    return defaultContext.cardinality() - defaultContext_removed.cardinality();
                } else {
                    return getHllCardinality(contextIndex, contextIndex_removed, var.getValue());
                }
            }
        }

The stacktrace:

Cannot invoke "org.eclipse.rdf4j.query.algebra.Var.getValue()" because "var" is null
java.lang.NullPointerException: Cannot invoke "org.eclipse.rdf4j.query.algebra.Var.getValue()" because "var" is null
    at org.eclipse.rdf4j.sail.extensiblestore.evaluationstatistics.ExtensibleDynamicEvaluationStatistics$ExtensibleDynamicEvaluationStatisticsCardinalityCalculator.getContextCardinality(ExtensibleDynamicEvaluationStatistics.java:231)
    at org.eclipse.rdf4j.query.algebra.evaluation.impl.EvaluationStatistics$CardinalityCalculator.meet(EvaluationStatistics.java:114)
    at org.eclipse.rdf4j.query.algebra.ZeroLengthPath.visit(ZeroLengthPath.java:181)
    at org.eclipse.rdf4j.query.algebra.evaluation.impl.EvaluationStatistics$CardinalityCalculator.meetBinaryTupleOperator(EvaluationStatistics.java:295)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractQueryModelVisitor.meet(AbstractQueryModelVisitor.java:473)
    at org.eclipse.rdf4j.query.algebra.Union.visit(Union.java:60)
    at org.eclipse.rdf4j.query.algebra.evaluation.impl.EvaluationStatistics$CardinalityCalculator.meetUnaryTupleOperator(EvaluationStatistics.java:304)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractQueryModelVisitor.meet(AbstractQueryModelVisitor.java:403)
    at org.eclipse.rdf4j.query.algebra.Projection.visit(Projection.java:80)
    at org.eclipse.rdf4j.query.algebra.evaluation.impl.EvaluationStatistics$CardinalityCalculator.meetUnaryTupleOperator(EvaluationStatistics.java:304)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractQueryModelVisitor.meet(AbstractQueryModelVisitor.java:208)
    at org.eclipse.rdf4j.query.algebra.Distinct.visit(Distinct.java:32)
    at org.eclipse.rdf4j.query.algebra.evaluation.impl.EvaluationStatistics.getCardinality(EvaluationStatistics.java:61)
    at org.eclipse.rdf4j.query.algebra.evaluation.optimizer.QueryJoinOptimizer$JoinVisitor.meet(QueryJoinOptimizer.java:196)
    at org.eclipse.rdf4j.query.algebra.Join.visit(Join.java:59)
    at org.eclipse.rdf4j.query.algebra.UnaryTupleOperator.visitChildren(UnaryTupleOperator.java:72)
    at org.eclipse.rdf4j.query.algebra.Extension.visitChildren(Extension.java:99)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractSimpleQueryModelVisitor.meetUnaryTupleOperator(AbstractSimpleQueryModelVisitor.java:595)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractSimpleQueryModelVisitor.meet(AbstractSimpleQueryModelVisitor.java:234)
    at org.eclipse.rdf4j.query.algebra.Extension.visit(Extension.java:94)
    at org.eclipse.rdf4j.query.algebra.UnaryTupleOperator.visitChildren(UnaryTupleOperator.java:72)
    at org.eclipse.rdf4j.query.algebra.Projection.visitChildren(Projection.java:86)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractSimpleQueryModelVisitor.meetUnaryTupleOperator(AbstractSimpleQueryModelVisitor.java:595)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractSimpleQueryModelVisitor.meet(AbstractSimpleQueryModelVisitor.java:414)
    at org.eclipse.rdf4j.query.algebra.Projection.visit(Projection.java:80)
    at org.eclipse.rdf4j.query.algebra.UnaryTupleOperator.visitChildren(UnaryTupleOperator.java:72)
    at org.eclipse.rdf4j.query.algebra.helpers.AbstractSimpleQueryModelVisitor.meet(AbstractSimpleQueryModelVisitor.java:430)
    at org.eclipse.rdf4j.query.algebra.QueryRoot.visit(QueryRoot.java:41)
    at org.eclipse.rdf4j.query.algebra.evaluation.optimizer.QueryJoinOptimizer.optimize(QueryJoinOptimizer.java:99)
    at org.eclipse.rdf4j.query.algebra.evaluation.impl.DefaultEvaluationStrategy.optimize(DefaultEvaluationStrategy.java:330)
    at org.eclipse.rdf4j.sail.base.SailSourceConnection.evaluateInternal(SailSourceConnection.java:251)
    at org.eclipse.rdf4j.sail.helpers.AbstractSailConnection.evaluate(AbstractSailConnection.java:333)
    at org.eclipse.rdf4j.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:52)
    at org.eclipse.rdf4j.testsuite.sparql.tests.BindTest.testBindError(BindTest.java:68)

Expected Behavior

The code should not throw a NullPointerException. Judging from the method body, it appears to fall back to the default context when the var has no value, so the following change would work:

        @Override
        protected double getContextCardinality(Var var) {
            synchronized (monitor) {
                if (var == null || var.getValue() == null) {
                    return defaultContext.cardinality() - defaultContext_removed.cardinality();
                } else {
                    return getHllCardinality(contextIndex, contextIndex_removed, var.getValue());
                }
            }
        }

Adding the var == null check eliminates the NPE, but I'm not sure that's the correct way to handle it. Maybe the parameter var should never be null in the first place?
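To make the proposed guard easier to try in isolation, here is a minimal, self-contained sketch of the same null-check pattern. It does not use RDF4J: `ContextVar` is a hypothetical stand-in for `org.eclipse.rdf4j.query.algebra.Var`, and the fixed numbers are hypothetical placeholders for the HyperLogLog-based estimates. It only shows that a missing var and an unbound var take the same fallback branch; it does not settle whether that fallback is semantically right.

```java
import java.util.Map;

// Hypothetical stand-in for org.eclipse.rdf4j.query.algebra.Var; only the value matters here.
final class ContextVar {
    private final String value;

    ContextVar(String value) {
        this.value = value;
    }

    String getValue() {
        return value;
    }
}

final class ContextCardinalitySketch {
    // Hypothetical numbers standing in for the real cardinality estimates.
    static final double DEFAULT_CONTEXT_CARDINALITY = 100.0;
    static final Map<String, Double> NAMED_CONTEXT_CARDINALITY = Map.of("urn:graph:a", 40.0);

    static double getContextCardinality(ContextVar var) {
        // Treat "no context var at all" and "context var without a value" alike.
        if (var == null || var.getValue() == null) {
            return DEFAULT_CONTEXT_CARDINALITY;
        }
        // Unknown named contexts are estimated as empty in this sketch.
        return NAMED_CONTEXT_CARDINALITY.getOrDefault(var.getValue(), 0.0);
    }

    public static void main(String[] args) {
        System.out.println(getContextCardinality(null));                          // prints 100.0
        System.out.println(getContextCardinality(new ContextVar(null)));          // prints 100.0
        System.out.println(getContextCardinality(new ContextVar("urn:graph:a"))); // prints 40.0
    }
}
```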

Steps To Reproduce

  1. Apply the RepositorySPARQLComplianceTestSuite (from rdf4j-sparql-testsuite) to ElasticsearchStore (from rdf4j-sail-elasticsearch-store)
  2. Run the tests
  3. org.eclipse.rdf4j.testsuite.sparql.tests.BindTest#testBindError should fail with an NPE

Kotlin code for the test:

import org.eclipse.rdf4j.repository.Repository
import org.eclipse.rdf4j.repository.config.RepositoryFactory
import org.eclipse.rdf4j.repository.config.RepositoryImplConfig
import org.eclipse.rdf4j.repository.sail.SailRepository
import org.eclipse.rdf4j.sail.elasticsearchstore.ElasticsearchStore
import org.eclipse.rdf4j.testsuite.sparql.RepositorySPARQLComplianceTestSuite

class JenaTest : RepositorySPARQLComplianceTestSuite(
    object : RepositoryFactory {
        override fun getRepositoryType(): String = "sail"

        override fun getConfig(): RepositoryImplConfig? = null

        override fun getRepository(config: RepositoryImplConfig?): Repository {
            return SailRepository(
                ElasticsearchStore(
                    "127.0.0.1", 9300,
                    "docker-cluster", "index")
            )
        }
    }
) 

Version

5.0.2

Are you interested in contributing a solution yourself?

Perhaps?

Anything else?

https://github.com/eclipse-rdf4j/rdf4j/blob/main/core/sail/extensible-store/src/main/java/org/eclipse/rdf4j/sail/extensiblestore/evaluationstatistics/ExtensibleDynamicEvaluationStatistics.java#L229-L237

hurui200320 commented 5 days ago

Additional note: I ran the same test suite against new SailRepository(new MemoryStore()), and all tests passed. So the issue is likely specific to the extensible store. With the var == null check the NPE goes away and the tests pass, but I'm not sure this really solves the underlying issue.

hmottestad commented 5 days ago

Seems a bit strange that it's null to begin with. Should probably find out if it being null means either the union of all graphs or the default unnamed graph.
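
The difference between those two readings can be illustrated with a small sketch. This is not RDF4J code: `Quad` is a hypothetical statement type in which a null context marks the default unnamed graph, and the counts are exact rather than estimated. Under these assumptions, treating a null var as "default graph only" and treating it as "union of all graphs" give different cardinalities for the same data, which is why the choice matters for the estimate.

```java
import java.util.List;

final class NullContextInterpretations {
    // Hypothetical statement type; context == null marks the default (unnamed) graph.
    static final class Quad {
        final String subject;
        final String predicate;
        final String object;
        final String context;

        Quad(String subject, String predicate, String object, String context) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.context = context;
        }
    }

    // Reading 1: a null context var means only the default unnamed graph.
    static long defaultGraphOnly(List<Quad> quads) {
        return quads.stream().filter(q -> q.context == null).count();
    }

    // Reading 2: a null context var means the union of all graphs.
    static long unionOfAllGraphs(List<Quad> quads) {
        return quads.size();
    }

    static List<Quad> sampleData() {
        return List.of(
                new Quad("urn:s1", "urn:p", "urn:o1", null),
                new Quad("urn:s2", "urn:p", "urn:o2", "urn:graph:a"),
                new Quad("urn:s3", "urn:p", "urn:o3", "urn:graph:b"));
    }

    public static void main(String[] args) {
        List<Quad> quads = sampleData();
        System.out.println(defaultGraphOnly(quads)); // prints 1
        System.out.println(unionOfAllGraphs(quads)); // prints 3
    }
}
```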