partiql / partiql-lang-kotlin

PartiQL libraries and tools in Kotlin.
https://partiql.org/
Apache License 2.0
538 stars 60 forks source link

Inconsistent Values Generated by Planner Through Iteration #833

Open johnedquinn opened 1 year ago

johnedquinn commented 1 year ago

Description

To Reproduce

Steps to reproduce the behavior:

  1. Example queries:
    • Remember, this won't fail in the CLI. Since it prints sequentially, all of the state registers will be appropriately allocated at materialization. If you, however, create an iterator of the resulting bag, and iterate and print in random orders, you'll always see the most up-to-date register as the result of the ExprValue.
      SELECT (SELECT i FROM <<0, 1>>) FROM << {'a': 1}, {'a': 5} >> LET a * 2 AS i
      
      // NOTE: This is for a different more complex query, but nonetheless, it shows what's wrong.
      val iterator = actualExprValueResult.iterator()  // this is a bag

val item1 = iterator.next() println(item1) println(item1.ionValue)

val item2 = iterator.next()

println(item1) println(item1.ionValue)

println(item2) println(item2.ionValue)

---- RESULT :

{'projListSubQuery': <<1>>} {projListSubQuery:$partiql_bag::[1]} {'projListSubQuery': <<5>>} {projListSubQuery:$partiql_bag::[1]} {'projListSubQuery': <<5>>} {projListSubQuery:$partiql_bag::[5]}



## Expected Behavior
- The resulting tuples of a SFW should be able to be accessed with correct values in whatever order.

## The Solution
- So, the solution was found after much debugging with @yliuuuu and @RCHowell . Essentially, at `compileBindingsToValues`, we need to mimic the clone of an ExprValue by taking the underlying IonValue and using it to create a new ExprValue using the ExprValueFactory. That way, it's not a reference to the global state's registers.
- While this is a great idea, it doesn't fully work, because when we grab the IonValue of a StructExprValue, it actually OMITS any MISSING attributes. So we don't have a 1:1 conversion when moving from PartiQL to Ion and back.
- So, two solutions:
  - We make ExprValue cloneable
  - Or, we allow StructExprValues to also add MISSING fields to the resulting IonValue. However, I spoke to @dlurton about this, and it goes against the spec. The spec, however, only enforces MISSING as a value within PartiQL tuples (not binding tuples).
- I'm unsure which solution is the best, but I would be inclined to say that ExprValue should be cloneable. Anyways, whoever picks this up can explore both avenues.

## Additional Context
- Java version: NA
- PartiQL version: 0.7+ main branch
johnedquinn commented 1 year ago
Screen Shot 2022-10-07 at 11 27 08 AM

For reference, the above image shows the work-around for performing a half-clone of an ExprValue. Realistically, we probably need to implement clone().

johnedquinn commented 1 year ago

BTW, in #821 , I added failing tests to both the EvaluatingCompilerGroupByTests and EvaluatingCompilerOrderByTests. We should strive to make ExprValue cloneable and then pass those tests.

lziq commented 1 year ago

When kotlin sequence computation involves non-local states, it can be buggy. As the following shows:

    val source = listOf(1, 2)
    val encountered = mutableSetOf<Int>()

    val value = sequence {
        source.forEach {
            if (!encountered.contains(it)) {
                encountered.add(it)
                yield(it)
            }
        }
    }

    println(value.count())
    println(value.count()) // This go through all the computation again. However, at this time, `encountered` is already updated with the previous call, so this line gives us a completely different result 

With result to be

2
0
lziq commented 1 year ago

PR https://github.com/partiql/partiql-lang-kotlin/pull/842 resolves the problem with subqueries. However, I just experienced the same with another query, which is SELECT VALUE [v] FROM <<11,12>> AS v.

To reproduce:

val result = compile("SELECT VALUE [v] FROM <<11,12>> AS v")

val it = result.iterator()
val item1 = it.next()
println(item1)

val item2 = it.next()
println(item1)
println(item2)

With printed result to be:

[11]
[12]
[12]

The compile method is shown as follow:

fun compile(query: String): ExprValue {

    val evaluatorOptions = org.partiql.lang.planner.EvaluatorOptions.builder().typingMode(TypingMode.PERMISSIVE).build()
    val session = EvaluationSession.standard()
    val ION = IonSystemBuilder.standard().build()

    val globalVariableResolver = org.partiql.lang.planner.GlobalVariableResolver {
        val value = session.globals[it]
        if (value != null) {
            org.partiql.lang.planner.GlobalResolutionResult.GlobalVariable(it.name)
        } else {
            org.partiql.lang.planner.GlobalResolutionResult.Undefined
        }
    }

    val plannerOptions = org.partiql.lang.planner.PartiQLPlanner.Options(
        allowedUndefinedVariables = true
    )

    val pipeline = org.partiql.lang.compiler.PartiQLCompilerPipeline(
        parser = org.partiql.lang.syntax.PartiQLParserBuilder().ionSystem(ION).build(),
        planner = org.partiql.lang.planner.PartiQLPlannerBuilder.standard()
            .options(plannerOptions)
            .globalVariableResolver(globalVariableResolver)
            .build(),
        compiler = org.partiql.lang.compiler.PartiQLCompilerBuilder.standard()
            .options(evaluatorOptions)
            .build(),
    )

    val statement = pipeline.compile(query)
    return when (val result = statement.eval(session)) {
        is org.partiql.lang.eval.PartiQLResult.Delete,
        is org.partiql.lang.eval.PartiQLResult.Replace,
        is org.partiql.lang.eval.PartiQLResult.Insert -> error("DML is not supported by test suite")
        is org.partiql.lang.eval.PartiQLResult.Value -> result.value
    }
}