EnterpriseDB / mysql_fdw

PostgreSQL foreign data wrapper for MySQL
Other
521 stars 160 forks source link

Truncating casts are not pushed down correctly #298

Open smilingthax opened 3 months ago

smilingthax commented 3 months ago
  1. mysql table with val float column, two rows with values 0 and 0.1.
  2. import as mysql_fdw foreign table into postgres, type of val becomes real
  3. (default) mysql_fdw_pushdown.config allows ROUTINE pg_catalog.int4(real) to be pushed down
  4. Tested w/ mariadb 10.11.2, postgres 16.2, mysql_fdw 2.9.1-2.pgdg120+1
pg# SELECT val::int4::float FROM ext_table;  -- correct, same for just val::int4 and int4(val)
 val
-----
   0
   0
(2 rows)

pg# EXPLAIN VERBOSE SELECT val::int4::float FROM ext_table;  -- somewhat suspicious
                             QUERY PLAN
---------------------------------------------------
 Foreign Scan on s.ext_table
   Output: ((val)::integer)::double precision
   Remote query: SELECT `val` FROM `s`.`t`

pg# SELECT val::int4 FROM ext_table GROUP BY 1;  -- returned output not parseable by pg ?
ERROR:  invalid input syntax for type integer: "0.1"

pg# EXPLAIN VERBOSE SELECT val::int4 FROM ext_table GROUP BY 1;  -- also: wrong grouping
                               QUERY PLAN
------------------------------------------------------------------
 Foreign Scan
   Output: ((val)::integer)
   Relations: Aggregate on (s.t)
   Remote query: SELECT `val` FROM `s`.`t` GROUP BY 1
(5 rows)

pg# SELECT val::int4::float FROM ext_table GROUP BY 1;  -- parseable, but wrong grouping result
 val
-----
   0
 0.1
(2 rows)

(for comparison:
mysql> SELECT CAST(CAST(val AS int4) AS float) FROM t GROUP BY 1;
+----------------------------------+
| CAST(CAST(val AS int4) AS float) |
+----------------------------------+
|                                0 |
+----------------------------------+
1 row in set (0.001 sec)  )

pg# SELECT int4(val) FROM ext_table GROUP BY 1;   -- same w/  EXPLAIN VERBOSE: int4 only exists as cast, not as function in mysql
ERROR:  failed to prepare the MySQL query:
You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'int4(`val`) FROM `s`.`t` GROUP BY 1' at line 1

(My original encounter with this issue was with ::date, resp. CAST(... AS date), when I tried to add

ROUTINE pg_catalog.date(timestamp without time zone)

to mysql_fdw_pushdown.config to allow it to be pushed down – w/o this entry, the grouping was (correctly) not pushed down, but performed by postgres. With the entry, the cast would be (wrongly) pushed down only partially, as described above. Using date(...) instead pushes both that and the GROUP BY down correctly then (date also exists as mysql function, where int4 doesn't) ...)

surajkharage19 commented 3 months ago

Hi @smilingthax,

If I understand your issue correctly, you are trying to say that cast operations should push down to the remote server to get the correct result.

So, considering this PG query -

SELECT val::int4::float FROM ext_table GROUP BY 1;

should be deparsed as

 SELECT CAST(CAST(val AS int4) AS float) FROM t GROUP BY 1;

to get the correct result?

I executed the above query on MySQL and got syntax error:

mysql> SELECT CAST(CAST(val AS int4) AS float) FROM real_test GROUP BY 1;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'int4) AS float) FROM real_test GROUP BY 1' at line 1
smilingthax commented 3 months ago

If I understand your issue correctly, you are trying to say that cast operations should push down to the remote server to get the correct result.

  1. Basically yes.

AFAIUI, mysql_fdw decides whether it can "handle" a certain operation, otherwise postgres will execute it locally.
In the current implementation, mysql_fdw says: "yes, I can handle casts", but does this not by pushing them down, but by somehow 'executing' them "after reading the result from mysql, but before postgres uses the data".

This is problematic, because (e.g.) truncating cast + group by cannot always "simply be done" after reading a result from mysql. Also, currently the expected (integer) vs. obtained values (potentially "0.1") disagree, which leads to the invalid input syntax for type integer: "0.1".

Either mysql_fdw MUST NOT claim to be able to handle them in non-trivial cases (and let postgres do the correct thing instead),
or it has to implement it by actually pushing the casts to the mysql server.

So, considering this PG query -

SELECT val::int4::float FROM ext_table GROUP BY 1;

should be deparsed as

 SELECT CAST(CAST(val AS int4) AS float) FROM t GROUP BY 1;

to get the correct result?

This would do the correct thing here, yes.

I executed the above query on MySQL and got syntax error:

mysql> SELECT CAST(CAST(val AS int4) AS float) FROM real_test GROUP BY 1;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'int4) AS float) FROM real_test GROUP BY 1' at line 1

The statement does not error on MariaDB 10.11.2;
I can confirm, though, that at least MariaDB 10.3.12 has problems with using float in the cast (and also certain other "type names", AFAICT... real, anyone?).

Replacing float with double does the trick there: (note that double is not a valid type in postgres, it would be float8)

MariaDB> SELECT CAST(CAST(val AS int4) AS double) FROM real_test GROUP BY 1;

I'm not sure, if that's enough to formulate a fixed postgres->mysql type mapping which works for as many mysql/mariadb versions as possible...?

  1. int4 only exists as cast, not as function in mysql

That's a separate issue: Currently, casts and function calls are treated as "the same" thing (ROUTINE) wrt. to push-down.
But in reality:

Therefore mysql_fdw_pushdown.config should most certainly distinguish these cases, e.g. by introducing a new keyword CAST (which types as cast target can be pushed down) or/and maybe TYPE (which postgres type name can/may be mapped to which mysql type name?).