apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.48k stars 1.01k forks source link

Comparison of different ScalarValue types produces the silent errors #3511

Open metesynnada opened 1 year ago

metesynnada commented 1 year ago

Describe the bug

PartialOrd implementation of ScalarValue types gives None for different types and it is automatically cast to false. I think this is the wrong behavior and introduces silent errors.

To Reproduce

Below test code shows the error

fn ord_diff_type() {
    assert_eq!(
        true,
        (ScalarValue::Float32(Some(2.0)) < ScalarValue::Int32(Some(3)))
    );
}

Expected behavior

I expected comparison to be true, however comparison (ScalarValue::Float32(Some(2.0)) < ScalarValue::Int32(Some(3))) returns false. Hence assertion fails. I think returning false here is misleading. This comparison either should produce an error or return true. If we do the comparison below it works as expected., since they have the same type

fn ord_same_type() {
    assert_eq!(
        true,
            (ScalarValue::Int32(Some(2))<
            ScalarValue::Int32(Some(3)))
    );
}

Additional context

I think casting the left or right side of the comparison to the larger type would solve the problem. The relevant section of the code is below https://github.com/apache/arrow-datafusion/blob/66dd253d2e0cdd00c8d3611f2ca470b9cde48abc/datafusion/common/src/scalar.rs#L196

HaoYang670 commented 1 year ago

This comparison either should produce an error or return true.

I prefer returning an error because the casting should have been done in the type_coercion stage.