EnterpriseDB / mysql_fdw

PostgreSQL foreign data wrapper for MySQL
Other
532 stars 163 forks source link

sql_mode setting can cause incorrect results #163

Open mr-russ opened 6 years ago

mr-russ commented 6 years ago

Reproduced example;

Current Behaviour GIVEN mysql_fdw.sql SETUP from this project WHEN select * from department WHERE department_name || 'xx' LIKE '%xx'; THEN 0 rows are returned

Expected Behaviour GIVEN mysql_fdw.sql SETUP from this project WHEN select * from department WHERE department_name || 'xx' LIKE '%xx'; THEN 100 rows are returned

This is caused by || being interpreted as OR and a boolean being false SELECT * FROM department WHERE false LIKE '%xx';

mysql_fdw.c:447,813,1499 all do: "SET sql_mode='ANSI_QUOTES'"

I have seen this is a real query, which is how I created the simple case even if it seems trivial or nonsensical.

In MySQL, or at least MariaDB 10.1.26-MariaDB-0+deb9u1 this is a replacement query, not an additive query. So even though I have PIPE_AS_CONCAT set on the server, it's dropped for the foreign table queries as it's overridden by the connection.

The following blog post has information on how to add and remove individual elements from the compatibility list. I don't know if that is what's needed here as I don't understand how the remote SQL is formed and what SQL modes it needs to operate and what surprises you will get if you change them. I have however had to research the following when changing SQL_MODES when using partitioned tables.

https://dev.mysql.com/doc/mysql-partitioning-excerpt/5.7/en/partitioning-limitations.html can be affected by wiping out some modes. ERROR_FOR_DIVISION_BY_ZERO and NO_UNSIGNED_SUBTRACTION are sighted as two examples where partitioning breaks. Users with those settings as non-default will experience issues when using partitioned tables.

I sadly don't have the required experience to submit a patch for this that I could test, however the solution appears to be something like;

  1. Add ANSI_QUOTES (which aren't used in the remote queries I have seen in explain analyze)
  2. Add PIPE_AS_CONCAT
  3. Review if PAD_CHAR_TO_FULL_LENGTH is needed

Other modes may be required, but can't really be changed when they could affect partitioning, ALLOW_INVALID_DATES is an example. The above 3 are partitioning safe from my reading of the mysql code with just does maths calculations for calculating partitions.

I know that doing 1-3 as mentioned will remove a number of cases and keep the SQL_MODE as close as possible to the servers.

If you have any questions, please let me know

ahsanhadi commented 6 years ago

you said

Current Behaviour GIVEN mysql_fdw.sql SETUP from this project WHEN select * from department WHERE department_name || 'xx' LIKE '%xx'; THEN 100 rows are returned

Expected Behaviour GIVEN mysql_fdw.sql SETUP from this project WHEN select * from department WHERE department_name || 'xx' LIKE '%xx'; THEN 100 rows are returned

What is the difference between the two?

I believe what you are saying is that II sign in a query is not getting deparsed correctly with mysql_fdw and hence it is resulting in query returning incorrect results?

Can you confirm?

mr-russ commented 6 years ago

Well, That's a little embarrassing. I re-read that submission multiple times. The current behaviour returns "0" rows.

  1. What you say is correct.
  2. I am also saying that SQL_MODE that exists in MySQL server configuration are ignored and completely overwritten by mysql_fdw. This is some circumstances can result in data corruption.