apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.21k stars 957 forks source link

`LogFunc` simplifier swaps the order of arguments #10359

Closed erratic-pattern closed 2 weeks ago

erratic-pattern commented 2 weeks ago

Describe the bug

The LogFunc::simplify function swaps the order of arguments.

To Reproduce

I have created a test case that can reproduce the issue:

    #[test]
    // Test that non-simplifiable log() expressions are unchanged after simplification
    fn test_log_simplify_original() {
        let props = ExecutionProps::new();
        let schema = Arc::new(DFSchema::new_with_metadata(vec![], HashMap::new()).unwrap());
        let context = SimplifyContext::new(&props).with_schema(schema);
        // One argument with no simplifications 
        let result = LogFunc::new()
            .simplify(vec![lit(2)], &context)
            .unwrap();
        match result {
            ExprSimplifyResult::Simplified(_) => panic!("Expected ExprSimplifyResult::Original"),
            ExprSimplifyResult::Original(args) => {
                assert_eq!(args.len(), 1);
                assert_eq!(args[0], lit(2));
            }
        }
        // Two arguments with no simplifications
        let result = LogFunc::new()
            .simplify(vec![lit(2), lit(3)], &context)
            .unwrap();
        match result {
            ExprSimplifyResult::Simplified(_) => panic!("Expected ExprSimplifyResult::Original"),
            ExprSimplifyResult::Original(args) => {
                assert_eq!(args.len(), 2);
                assert_eq!(args[0], lit(2));
                assert_eq!(args[1], lit(3));
            }
        }
    }

test results:

---- math::log::tests::test_log_simplify_original stdout ----
thread 'math::log::tests::test_log_simplify_original' panicked at datafusion/functions/src/math/log.rs:330:17:
assertion `left == right` failed
  left: Literal(Int32(3))
 right: Literal(Int32(2))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Expected behavior

Arguments should remain in correct order of log(base, number)

Additional context

This bug seems to be hidden by the fact that the expression simplifier runs twice. In most cases this means the arguments will swap back into place, but it may be possible to craft expressions which only run simplify once.

While working on PR https://github.com/apache/datafusion/pull/10358 to run the simplifer in a loop, I began to encounter this bug in the case of 3 iterations.