Describe the unexpected behaviour
The TPC-DS q99 result differs between vanilla Spark and Spark with the ClickHouse (CH) backend.
How to reproduce
test("TPCDS Q99") {
  runTPCDSQuery("q99") { df => }
  // Thread.sleep(1000000)
}
Add this test to GlutenClickHouseTPCDSParquetSuite to reproduce the issue.
Expected behavior
cs_sold_date_sk is a partition column of the catalog_sales table.
When cs_sold_date_sk has a null partition value (for example, when some data is placed in the directory 'cs_sold_date_sk=__HIVE_DEFAULT_PARTITION__'),
the Spark+CH backend reads the null as the default value of the column's type, which is zero in q99's case.
This differs from Spark, which reads the null as null.
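The difference described above can be illustrated with a small, self-contained sketch. This is not Gluten or Spark code: the object and function names (`PartitionValueSketch`, `parseAsSpark`, `parseWithDefault`) are hypothetical, and the column is assumed to be an integer. It only models how the partition directory name should map to a value, with `None` standing in for SQL null.

```scala
// Hypothetical sketch; these helpers do not exist in Spark or Gluten.
object PartitionValueSketch {
  // Directory-name sentinel that Hive and Spark use for a null partition value.
  val HiveDefaultPartition = "__HIVE_DEFAULT_PARTITION__"

  // Expected semantics (vanilla Spark): the sentinel maps to null, modelled here as None.
  // Assumes dirName has the form "<column>=<value>" and the column is an integer.
  def parseAsSpark(dirName: String): Option[Int] = {
    val raw = dirName.split("=", 2)(1)
    if (raw == HiveDefaultPartition) None else Some(raw.toInt)
  }

  // Behaviour reported in this issue: the null is replaced by the
  // type's default value (0 for an integer column).
  def parseWithDefault(dirName: String): Int =
    parseAsSpark(dirName).getOrElse(0)

  def main(args: Array[String]): Unit = {
    val dirs = Seq(
      "cs_sold_date_sk=2450815",
      "cs_sold_date_sk=" + HiveDefaultPartition
    )
    dirs.foreach { d =>
      println(s"$d -> spark: ${parseAsSpark(d)}, default-filled: ${parseWithDefault(d)}")
    }
  }
}
```

Under the default-filled behaviour, rows from the null partition become indistinguishable from rows whose key is genuinely 0, so joins and filters on cs_sold_date_sk can see a different set of rows than vanilla Spark does, which would be consistent with the q99 mismatch.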