Kyligence / ClickHouse

ClickHouse® is a free analytics DBMS for big data
https://clickhouse.com
Apache License 2.0

Read null partition value problem #426

Open lhuang09287750 opened 1 year ago

lhuang09287750 commented 1 year ago

Describe the unexpected behaviour: the TPC-DS q99 result differs between Spark and Spark with the CH backend.

How to reproduce: add the following test to GlutenClickHouseTPCDSParquetSuite:

    test("TPCDS Q99") {
      runTPCDSQuery("q99") { df => }
      // Thread.sleep(1000000)
    }

Expected behavior: cs_sold_date_sk is a partition column of the catalog_sales table. When cs_sold_date_sk has a null partition value (for example, when there is data under the directory 'cs_sold_date_sk=__HIVE_DEFAULT_PARTITION__'), the Spark + CH backend reads the null as the default value of the column's type, which is zero in q99's case. This differs from Spark, which reads the null as null; the CH backend should do the same.
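To make the difference concrete, here is a minimal Scala sketch of the two ways a partition directory value could be interpreted. It is not taken from Gluten or the ClickHouse backend: the object `PartitionValueSketch` and both parse functions are hypothetical names for illustration only. The only fact it relies on is that Hive and Spark use the directory name `__HIVE_DEFAULT_PARTITION__` for a null partition value.

```scala
// Illustrative sketch only; hypothetical helper, not Gluten or ClickHouse code.
object PartitionValueSketch {
  // Directory name that Hive and Spark use to represent a null partition value.
  val HiveDefaultPartition = "__HIVE_DEFAULT_PARTITION__"

  // Spark-like interpretation: the sentinel maps to None, i.e. SQL NULL.
  def parseNullable(raw: String): Option[Int] =
    if (raw == HiveDefaultPartition) None else Some(raw.toInt)

  // Interpretation matching the reported behaviour: a null partition value
  // collapses to the integer type's default, 0.
  def parseWithDefault(raw: String): Int =
    parseNullable(raw).getOrElse(0)
}
```

With the sentinel as input, `parseNullable` yields `None` while `parseWithDefault` yields `0`, so any filter or join on cs_sold_date_sk can see different rows in the two cases.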

lhuang09287750 commented 1 year ago

The PR for this issue is https://github.com/Kyligence/ClickHouse/pull/427.