sunng87 / datafusion-postgres

Serving any JSON/CSN/Parquet/Arrow files like Postgres tables with Datafusion
Apache License 2.0
22 stars 3 forks source link

NULLs are not returned in a list #5

Closed mdwaud closed 6 months ago

mdwaud commented 6 months ago

I have a list where some elements may be NULL. Unfortunately they aren't showing up in the return, for example:

In postgres

postgres=# SELECT array[1,NULL,3];
   array    
------------
 {1,NULL,3}
(1 row)

In datafusion

datafusion=> SELECT array[1,NULL,3];
 make_array(Int64(1),NULL,Int64(3)) 
------------------------------------
 [1,,3]
(1 row)
mdwaud commented 6 months ago

Hmmm... array_length is telling me it's there, though:

In postgres

postgres=# SELECT ARRAY_LENGTH(ARRAY[1, NULL], 1);
 array_length 
--------------
            2
(1 row)

In datafusion-postgres

datafusion=>  SELECT ARRAY_LENGTH(ARRAY[1, NULL], 1);
 array_length(make_array(Int64(1),NULL),Int64(1)) 
--------------------------------------------------
                                                2
(1 row)

Let me see if I can come up with a better example to reproduce what I'm seeing. For context, I have a column in data fusion where a value may be None. In psql I am seeing an empty list when the value is None (instead of [None]). It's possible there's an error in my setup.

sunng87 commented 6 months ago

Yeah, I think it's just how pgwire serialize NULL value. It should be text NULL instead of empty string.

sunng87 commented 6 months ago

Just learned that postgres only uses NULL for NULL value in array (and other compound data strcutures maybe)

mdwaud commented 6 months ago

🎉 I can confirm that https://github.com/sunng87/pgwire/pull/174 fixed the issue. The reason I found this is that I'm using sqlx in a test suite on the output of a query.

One other thing I noticed is that booleans appear to have a similar issue. Ex:

In postgres

postgres=# SELECT false;
 bool 
------
 f
(1 row)

In datafusion-postgres (pgwire)

datafusion=> SELECT false;
 Boolean(false) 
----------------
 false
(1 row)

My test is failing because sqlx only accepts t / f (https://github.com/launchbadge/sqlx/blob/main/sqlx-postgres/src/types/bool.rs), however it's unclear what the official postgres spec is on this.

sunng87 commented 6 months ago

Thank you for pointing out. I added another fix for to_sql_text for bool.

sunng87 commented 6 months ago

I just released a new version of pgwire to include this fix. You can also test latest HEAD from this repo

mdwaud commented 6 months ago

Thanks for the quick response!