apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[CH] to_date results differ when `spark.sql.legacy.timeParserPolicy` is set to different values #7896

Open KevinyhZou opened 1 week ago

KevinyhZou commented 1 week ago

Backend

CH (ClickHouse)

Bug description

sql query

select to_date('2025-07-22 10:00:00', 'yyyy-MM-dd')

In vanilla Spark, when spark.sql.legacy.timeParserPolicy is set to legacy, the query returns '2025-07-22'; when it is set to corrected, the query returns NULL.

In Gluten, it always returns '2025-07-22'.

Spark version

Spark-3.3.x

Spark configurations

No response

System information

No response

Relevant logs

No response

KevinyhZou commented 4 days ago

For the query select to_date('2025-07-22 10:00:00', 'yyyy-MM-dd'), the to_date call is transformed into the get_timestamp function to parse the date.

When spark.sql.legacy.timeParserPolicy is set to legacy:

  1. if the format string (e.g. yyyy-MM-dd) is shorter than yyyy-MM-dd HH:mm:ss, it should parse only the substring of the input string that corresponds to the format length;
  2. if the format string (e.g. yyyy-MM-dd HH:mm:ss.S) is longer than yyyy-MM-dd HH:mm:ss, it should always parse the timestamp with the fractional-second (microsecond) precision set to 3.

When spark.sql.legacy.timeParserPolicy is set to corrected, it should parse the date time strictly according to the given format.
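To illustrate the difference between the two policies, here is a minimal Python sketch of the short-format legacy rule and the corrected rule. It is not Gluten or ClickHouse code: the names `parse_timestamp`, `to_date`, and `_PATTERNS` are hypothetical, only a few fixed-width patterns are mapped, and the rule for formats longer than yyyy-MM-dd HH:mm:ss is not covered.

```python
from datetime import datetime
from typing import Optional

# Hypothetical mapping from a few fixed-width Spark patterns to strptime patterns.
_PATTERNS = {
    "yyyy-MM-dd": "%Y-%m-%d",
    "yyyy-MM-dd HH:mm": "%Y-%m-%d %H:%M",
    "yyyy-MM-dd HH:mm:ss": "%Y-%m-%d %H:%M:%S",
}
_BASE_FORMAT = "yyyy-MM-dd HH:mm:ss"


def parse_timestamp(value: str, spark_format: str, policy: str) -> Optional[datetime]:
    """Return the parsed timestamp, or None (standing in for SQL NULL) on failure."""
    py_format = _PATTERNS[spark_format]
    if policy.lower() == "legacy" and len(spark_format) < len(_BASE_FORMAT):
        # Legacy, short format: only the leading part of the input is parsed.
        # Truncating by format length assumes a fixed-width pattern.
        value = value[: len(spark_format)]
    try:
        # strptime rejects trailing unparsed text, which models the strict
        # "corrected" behaviour.
        return datetime.strptime(value, py_format)
    except ValueError:
        return None


def to_date(value: str, spark_format: str, policy: str):
    ts = parse_timestamp(value, spark_format, policy)
    return ts.date() if ts is not None else None


print(to_date("2025-07-22 10:00:00", "yyyy-MM-dd", "legacy"))
print(to_date("2025-07-22 10:00:00", "yyyy-MM-dd", "corrected"))
```

Under these assumptions the first call yields the date 2025-07-22 and the second yields None, matching the vanilla Spark results reported in the bug description.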