Open kagamiori opened 2 months ago
cc @kevinwilfong @DanielHunte
I think Daniel is planning to revert the change, since it generally results in inconsistent ordering of the result arrays.
@kevinwilfong Got it. Curious what other problems this inconsistent ordering causes (besides the fuzzer)?
While I was going through another part of the code for array_intersect, I noticed a comment that mentioned we only optimize for ConstantVectors if they're in the right-hand side argument.
I can see how it might surprise users. If they call array_intersect and the order of the result arrays depends on the encoding of the argument Vectors or the size of other arrays in the same batch, it would seem pretty random to them. Given that someone took the time to leave that comment, it seems someone else felt the same way and decided consistent ordering was more important than performance.
Description
It seems that after the fix https://github.com/facebookincubator/velox/pull/10624 (context in https://github.com/facebookincubator/velox/issues/10561), array_intersect now produces result arrays with different element orders depending on the encoding of the input vectors. Since the simplified evaluator always flattens the input vectors first, the simplified evaluation result can have a different element order from the common evaluation result when the first argument is a constant literal.
We would need to add support for a custom result verifier in the expression fuzzer that sorts the elements before comparing the result arrays from the common evaluator and the simplified evaluator.
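
To illustrate the idea, here is a minimal, language-agnostic sketch in Python. It is not Velox code and does not use the expression fuzzer's API. All names (`intersect_in_order_of`, `arrays_equal_ignoring_order`, `verify_rows`) are hypothetical, and the assumption that the two code paths differ by which argument drives the output order is only one plausible way the mismatch could arise.

```python
from collections import Counter


def intersect_in_order_of(probe, other):
    """Distinct elements of `probe` that also occur in `other`, in `probe` order.

    Hypothetical stand-in for an intersect implementation; which argument
    plays the `probe` role is an assumption made for illustration.
    """
    lookup = set(other)
    seen = set()
    result = []
    for x in probe:
        if x in lookup and x not in seen:
            seen.add(x)
            result.append(x)
    return result


def arrays_equal_ignoring_order(a, b):
    # Compare as multisets so that element order does not matter.
    return Counter(a) == Counter(b)


def verify_rows(common_rows, simplified_rows):
    """Order-insensitive comparison of two result columns (lists of arrays)."""
    if len(common_rows) != len(simplified_rows):
        return False
    for c, s in zip(common_rows, simplified_rows):
        if c is None or s is None:
            # A null row only matches another null row.
            if c is not s:
                return False
            continue
        if not arrays_equal_ignoring_order(c, s):
            return False
    return True


left = [1, 2, 3, 4]
right = [4, 3, 1]

# Same inputs, but the output order follows a different argument in each path.
flat_path = intersect_in_order_of(left, right)
constant_path = intersect_in_order_of(right, left)

print(flat_path == constant_path)                 # exact comparison fails
print(verify_rows([flat_path], [constant_path]))  # order-insensitive comparison passes
```

An exact comparison flags these two results as a mismatch even though they contain the same elements, while the order-insensitive check accepts them. Comparing as multisets rather than calling a sort also avoids requiring a total order on the element type, though it does require hashable elements in this sketch.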
Error Reproduction
A fuzzer failure can be reproduced via presto-fuzzer-failure-artifacts.
Relevant logs