apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.93k stars 1.12k forks source link

`array_union` and `array_intersect` cannot handle NULL columnar data #9706

Open Weijun-H opened 6 months ago

Weijun-H commented 6 months ago

Describe the bug

It seems that the functions array_union and array_intersect are unable to process columnar data that contains NULL values. These NULL values are being overlooked in the process.

To Reproduce

❯ 
CREATE TABLE array_intersect_table
AS VALUES
  ([1, 2, 2, 3], [2, 3, 4]),
  ([2, 3, 3], [3]),
  ([3], [3, 3, 4]),
  (null, [3, 4]),
  ([1, 2], null),
  (null, null)
;
0 rows in set. Query took 0.013 seconds.

❯ select array_intersect(column1, column2) from array_intersect_table;
+------------------------------------------------------------------------+
| array_intersect(array_intersect_table.column1,array_except_table.column2) |
+------------------------------------------------------------------------+
| [2, 3]                                                                 |
| [3]                                                                    |
| [3]                                                                    |
+------------------------------------------------------------------------+
3 rows in set. Query took 0.007 seconds.

Expected behavior

No response

Additional context

No response

Weijun-H commented 6 months ago

take