pgspider / sqlite_fdw

SQLite Foreign Data Wrapper for PostgreSQL
Other
212 stars 37 forks source link

NaN values converted to 0 #36

Closed Soni-Harriz closed 2 months ago

Soni-Harriz commented 3 years ago

In sqlite we had real data type, and store "-nan(ind)" values, I believe that it is a NaN values. But it gets converted to 0 on postgres. I believe it should be NaN also in postgres.

NaN values in sqlite real also converted to 0, and any text on sqlite real is converted to 0 also

t-kataym commented 3 years ago

Thank you for reporting.

I would like to understand the situation well and to know SQL to reproduce it. Could you tell me followings?

mkgrgis commented 1 year ago

@Soni-Harriz, any SQL?

mkgrgis commented 1 year ago

In PostgreSQL there is correct SQL

SELECT FLOAT8 '+infinity' "+∞", FLOAT8 '-infinity' "-∞", FLOAT8 'NaN' "NaN";

OK, there is support of all special values. Let's test SQlite.

create table "Float" ( n real);
insert into "Float" values ('-Infinity'), ('+Infinity'), ('NaN'), (2), (0), (5);
select * from "Float";

No problems.

What about sqlite_fdw?

CREATE FOREIGN TABLE public."Float" (
    "n" float8,
    "N" double precision    
)
SERVER sqlite_server;

SELECT * FROM public."Float";

What is result? invalid input syntax for type =2, column type =3 near https://github.com/pgspider/sqlite_fdw/blob/3a495b79bdedab4fdbcbacea452deb8599196341/sqlite_query.c#L93 It seems no support for special values in sqlite_fdw.

t-kataym commented 1 year ago

SQLite does not have a concept of 'Infinity' and 'NaN' as a number.
You can insert these data but they are just treated as text value in SQLite.

sqlite> select 1 > 'Infinity';
0
sqlite> select 1 > '-Infinity';
0

SQLite can have different types of data in a column.

sqlite> CREATE TABLE t1(a NUMERIC);
sqlite> INSERT INTO t1 VALUES(1);
sqlite> INSERT INTO t1 VALUES('t');
sqlite> SELECT * FROM t1;
1
t

But PostgreSQL cannot. So SQLite FDW expects that data type of record is same as column type in PostgreSQL.

mkgrgis commented 1 year ago

SQLite does not have a concept of 'Infinity' and 'NaN' as a number.

Let's test select typeof(n),* from "Float"; from previous example.

typeof(n) n
text NaN
real 2.0
real 0.0
real 5.0
text -infinity
text +infinity
text -Infinity
text +Infinity

Yes, infinity values is not real by affinity. But the second fact is PostgreSQL can convert '+infinity' to float value.

Let's check with SQLite strict tables on https://sqlite.org/fiddle/.

-- Initial SQL, no problems.
create table "Float strict" ( n real) strict;
insert into "Float strict" values (2);
insert into "Float strict" values (0.0);
insert into "Float strict" values (5.1);

What about this queries?

insert into "Float strict" values ('-Infinity');
insert into "Float strict" values ('+Infinity');
insert into "Float strict" values ('NaN');

In all cases we have Runtime error near line 10: cannot store TEXT value in REAL column Float strict.n (19). In fact there is no special non-numeric values in SQlite for real affinity.

So @Soni-Harriz wasn't right writing

In sqlite we had real data type, and store "-nan(ind)" values, I believe that it is a NaN values. But it gets converted to 0 on postgres. I believe it should be NaN also in postgres.

I don't know if there is sense to check special non-numeric values in sqlite_fdw.

mkgrgis commented 1 year ago

What is sqlite_fdw? I think it's special SQLite client as data input for PostgreSQL. Hence PostgreSQL behaviour with input data is primary. For PostgreSQL data types from real family string literals such as '+infinity', '-infinity', 'NaN' isn't incorrect input, hence there is sense to support this literals even if there is no special values for real in SQLite. I think SQLite STRICT mode will not main practice during long times. So, exception for special literals can be coded.

mkgrgis commented 1 year ago

Fixed in https://github.com/pgspider/sqlite_fdw/pull/57. It wasn't very hard to find lines of code for fixing this problem.

mkgrgis commented 10 months ago

SQLite does not have a concept of 'Infinity' and 'NaN' as a number.

@t-kataym your info is outdated. In SQLite there is native ∞ conception as special values with real affinity. Please SELECT 9e999; or insert this value to a column. Also this works in STRICT mode because of real affinity. Deparsing PostgreSQL Infinity to SQLite 9e999 gives excellent ISO:SQL results both on SQLite side and PostgreSQL side. I am preparing PR with full ∞ support including ISO:SQL behaviour on PostgreSQL side around of Infinity value with text SQLite affinity. This was not hard.

More harder is NaN conception. Now we have no NaN storage conception in SQLite, but during data processing C function sqlite3_result_double(context, 0.0/0.0); gives SQL NULL value. Look like my PR will be ready next week (after yet ready for review bool mixed affinity support). I hope we continue to discuss about NaN near of a new PR, but now I have important notice about not ISO:SQL behaviour in tests https://github.com/pgspider/sqlite_fdw/blob/d6387a8ea33129d69c45389ee20180d809a03f03/expected/16.0/extra/aggregates.out#L521

The tests gives 0 , but PostgreSQL and ISO:SQL behaviour is NaN, see PostgreSQL SELECT sum('NaN'::float8) from information_schema.columns;

@Soni-Harriz mistakes about

In sqlite we had real data type, and store "-nan(ind)" values, I believe that it is a NaN values. But it gets converted to 0 on postgres.

Look like here is description of some sum problem, because this function really gets 0 with NaN arguments in FDW SQL context like in tests. What do you think about, @t-kataym ?

mkgrgis commented 10 months ago

@Soni-Harriz , could you give us please some fragments of your testcase with unwanted for You sqlite_fdw behaviour? Now I am coding about NaN and this test case will be interesting for me.

mkgrgis commented 10 months ago

@t-kataym , in current tests there is aggregates with NaN and expected 0 addition to sum for example. If we treat NaN from PostgreSQL as NULL sum will gives NULL because one of members is NULL. Which behaviour is better? In PostgreSQL we have not 0.0 in the nearest case.

pg利用者=> create table x (f float8);
CREATE TABLE
pg利用者=> insert into x values (2), (3), ('NaN');
INSERT 0 3
pg利用者=> select sum(f) from x;
 sum 
-----
 NaN
(1 行)
t-kataym commented 9 months ago

I think sqlite_fdw does not need to treat NaN value when it is fetched from SQLite table because SQLite treats :nan as NULL.

The behavior on SQLite:

sqlite> CREATE TABLE t1 (f REAL);
sqlite> INSERT INTO t1 VALUES (2), (3), (:nan);
sqlite> SELECT sum(f) FROM t1;
5
sqlite> SELECT f FROM t1;
2.0
3.0
NULL

So sum(f) on PostgreSQL via sqlite_fdw returns 5.0 as same as a normal table on PostgreSQL.

mkgrgis commented 9 months ago

After your message, @t-kataym , i have understand that ±∞ and NaN should be different PRs for float-like datatypes in our FDW. PR about ∞ will be simpler by conception but harder by tests. PR about NaN will be simpler by C-code and tests but harder by conception. Anyway there is possible fact: NaN value in INSERT from sqlite_fdw. This can be converted to NULL or text-affinity 'NaN'. Both possibilities is not ISO:SQL and have unwanted effects. So we'll continue discussing around of NaN near one of future PR.

mkgrgis commented 3 months ago

Hello @t-kataym ! After ±∞ support draft in https://github.com/pgspider/sqlite_fdw/pull/97 I can list all NaN processing problems in the FDW.

Main source of our problems is SQLite sqlite3VdbeMemSetDouble where there is if( !sqlite3IsNaN(val)).

This is not ISO:SQL behavior and we have no chances to use SQLite data in PostgreSQL style. This usage is yet implemented for boolean special values or for ∞ special values or as for mixed affinity UUIDs or for other cases.

According ISO:SQL all values should be

We absolutely cannot implement all of this properties for NaN at the same time. Current implementation deparses NaN to SQLite as value with text affinity. https://github.com/pgspider/sqlite_fdw/blob/cdcf0baffef09298604a9678f6ad72f40bc722d4/deparse.c#L2669 Look like this code was borrowed from postgres_fdw and not tested. NaN value with text affinity is not readable and filterable in current implementation and was not readable before my PRs! https://github.com/pgspider/sqlite_fdw/blob/cdcf0baffef09298604a9678f6ad72f40bc722d4/sqlite_query.c#L268

Possible alternatives:

  1. Readable and writable NaN with text affinity, but not filterable (no filter predicate pushdowning)
  2. Writable and filterable NaN as NULL but not readable (no difference with real NULL)
  3. Something other?

@ALL , which NaN implementation does you recommended for future PR? All tests yet have written but no optimal result understanding now.

mkgrgis commented 3 months ago

Special branch with NaN processing tests added. Now NaN is unfiltrable/undetectable:

--Testcase 329:
SELECT * FROM "type_FLOAT_INF+" WHERE f = 'NaN' ORDER BY i;
 i | f | t | l 
---+---+---+---
(0 行)
t-kataym commented 2 months ago

https://github.com/pgspider/sqlite_fdw/pull/100#issuecomment-2175797564 PR #100 was merged. I close this PR.