brianmario / mysql2

A modern, simple and very fast Mysql library for Ruby - binding to libmysql
http://github.com/brianmario/mysql2
MIT License
2.24k stars 550 forks source link

"`new': invalid date (ArgumentError)" raised when use prepare statement with UNION. #963

Open YukiJikumaru opened 6 years ago

YukiJikumaru commented 6 years ago

WHen I use Mysql2::Statement#execute with "UNION", invalid date (ArgumentError) is raised.

environment and version

[jiku@centos7 work]$ uname -a
Linux centos7 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[jiku@centos7 work]$ ruby -v
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-linux]
[jiku@centos7 work]$ bundle exec ruby -e 'require "mysql2"; puts Mysql2::VERSION'
0.5.1

source code

require 'mysql2'

puts Mysql2::VERSION # => 0.5.1

client = Mysql2::Client.new(
  as: :array,
  encoding: 'utf8mb4',
  collation: 'utf8mb4_general_ci',
  username: 'root',
  password: "PASS",
  host: nil,
  database: 'foo',
  socket: '/var/lib/mysql/mysql.sock'
)

# PREPARE TABLES
client.query('DROP TABLE IF EXISTS foo1;')
client.query('CREATE TABLE `foo1` (`id` int(11) unsigned NOT NULL, `ymd` date NOT NULL);')
client.query('DROP TABLE IF EXISTS foo2;')
client.query('CREATE TABLE `foo2` (`id` int(11) unsigned NOT NULL, `ymd` date NOT NULL);')

# PREPARE DATA (If no data this error is not raised.)
client.query("INSERT INTO `foo1` (`id`, `ymd`) VALUES (1, '2018-01-01');")
client.query("INSERT INTO `foo2` (`id`, `ymd`) VALUES (1, '2018-02-01');")

sql = <<SQL
SELECT
  foo1.ymd
FROM foo1
UNION ALL
SELECT
  foo2.ymd
FROM foo2
SQL

statement = client.prepare(sql)
result = statement.execute # RAISE ERROR

# If use Mysql2::Client#query. This does not raise error.
# result = client.query(sql)

expected behavior

No error should be raised, and Array of Date should be returned as like client.query(sql) .

actual behavior

[jiku@centos7 work]$ bundle exec ruby invalid_date.rb                                                                                                                                                          
0.5.1
Traceback (most recent call last):
        6: from invalid_date.rb:38:in `<main>'
        5: from /home/jiku/work/vendor/bundle/ruby/2.5.0/gems/mysql2-0.5.1/lib/mysql2/statement.rb:6:in `execute'
        4: from /home/jiku/work/vendor/bundle/ruby/2.5.0/gems/mysql2-0.5.1/lib/mysql2/statement.rb:6:in `handle_interrupt'
        3: from /home/jiku/work/vendor/bundle/ruby/2.5.0/gems/mysql2-0.5.1/lib/mysql2/statement.rb:7:in `block in execute'
        2: from /home/jiku/work/vendor/bundle/ruby/2.5.0/gems/mysql2-0.5.1/lib/mysql2/statement.rb:7:in `_execute'
        1: from /home/jiku/work/vendor/bundle/ruby/2.5.0/gems/mysql2-0.5.1/lib/mysql2/statement.rb:7:in `each'
/home/jiku/work/vendor/bundle/ruby/2.5.0/gems/mysql2-0.5.1/lib/mysql2/statement.rb:7:in `new': invalid date (ArgumentError)
sodabrew commented 6 years ago

I found this: https://dba.stackexchange.com/questions/163489/mysql-union-all-causing-date-field-to-lose-format

Does this help?

sql = <<SQL
(SELECT ymd FROM foo1)
UNION ALL
(SELECT ymd FROM foo2)
SQL

The parentheses become mandatory in MySQL 5.7 and above - are you using 5.6 for this test?

The table names are superfluous within each query, so I am hoping that by removing the table names, there is a matching correspondence of column name in each query. This may or may not matter, it was not clear to me from the MySQL docs if the rows are aligned by name or strictly by order. I'll see about reproducing on my test machine later on.

YukiJikumaru commented 6 years ago

@sodabrew Thank you for your helping!

Does this help?

sql = <<SQL
(SELECT ymd FROM foo1)
UNION ALL
(SELECT ymd FROM foo2)
SQL

I tried the above DML. Unfortunately Identical error happended šŸ˜¢

The parentheses become mandatory in MySQL 5.7 and above - are you using 5.6 for this test?

I'm using MySQL 5.7.17, settings below.

[jiku@centos7 work]$ mysql --version
mysql  Ver 14.14 Distrib 5.7.17, for Linux (x86_64) using  EditLine wrapper
[jiku@centos7 work]$ rpm -qa | grep mysql
mysql-community-server-5.7.17-1.el7.x86_64
mysql-community-common-5.7.17-1.el7.x86_64
mysql-community-libs-5.7.17-1.el7.x86_64
mysql-community-libs-compat-5.7.17-1.el7.x86_64
mysql-community-client-5.7.17-1.el7.x86_64
mysql-community-devel-5.7.17-1.el7.x86_64
mysql> show variables like 'sql_mode';
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                                                                     |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| sql_mode      | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

ā†‘ Complete info is attached as "mysql_variables.txt".

/etc/my.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

symbolic-links=0

pid-file=/var/run/mysqld/mysqld.pid

character-set-server=utf8mb4
collation-server=utf8mb4_general_ci 

# Log
general_log_file=/var/log/mysqld_general.log
general_log=0
log-error=/var/log/mysqld_error.log
slow_query_log=1
slow_query_log_file=/var/log/mysqld_slow.log
long_query_time=1

# InnoDB
innodb_file_per_table=1
innodb_file_format=Barracuda
innodb_flush_method = O_DIRECT

[client]
port=3306
socket=/var/lib/mysql/mysql.sock
default-character-set=utf8mb4
password="PASS"

Then I tried 2 DMLs on MySQL CLI, both succeeded.

mysql> (SELECT ymd FROM foo1)
    -> UNION ALL
    -> (SELECT ymd FROM foo2);
+------------+
| ymd        |
+------------+
| 2018-01-01 |
| 2018-02-01 |
+------------+
2 rows in set (0.00 sec)

mysql> SELECT ymd FROM foo1
    -> UNION ALL
    -> SELECT ymd FROM foo2;
+------------+
| ymd        |
+------------+
| 2018-01-01 |
| 2018-02-01 |
+------------+
2 rows in set (0.00 sec)
YukiJikumaru commented 6 years ago

I found a workaround. If I use derived table, this error is not raised.

šŸ†–

(SELECT ymd FROM foo1)
UNION ALL
(SELECT ymd FROM foo2);

šŸ†—

SELECT ymd FROM (
  (SELECT ymd FROM foo1)
  UNION ALL
  (SELECT ymd FROM foo2)
) bar;

SELECT ymd FROM (
  SELECT ymd FROM foo1
  UNION ALL
  SELECT ymd FROM foo2
) bar;

SELECT ymd FROM (
  SELECT foo1.ymd FROM foo1
  UNION ALL
  SELECT foo2.ymd FROM foo2
) bar;

But very strange.

YukiJikumaru commented 6 years ago

Test Code

[jiku@centos7 mysql2]$ git diff  spec/mysql2/statement_spec.rb 
diff --git a/spec/mysql2/statement_spec.rb b/spec/mysql2/statement_spec.rb
index dbc185e..2abd1d3 100644
--- a/spec/mysql2/statement_spec.rb
+++ b/spec/mysql2/statement_spec.rb
@@ -153,6 +153,18 @@ RSpec.describe Mysql2::Statement do
     expect(result.first.first[1]).to be_an_instance_of(Time)
   end

+  # This test will fail with ArgumentError
+  it "should select dates with UNION" do
+    statement = @client.prepare "SELECT STR_TO_DATE('2018-01-01', '%Y-%m-%d') UNION ALL SELECT STR_TO_DATE('2018-01-01', '%Y-%m-%d')"
+    result = statement.execute
+    expect(result.first.first[1]).to be_an_instance_of(Date)
+  end
+
+  # This test will pass
+  it "should select dates with UNION and desribed table" do
+    statement = @client.prepare "SELECT ymd FROM (SELECT STR_TO_DATE('2018-01-01', '%Y-%m-%d') AS ymd UNION ALL SELECT STR_TO_DATE('2018-01-01', '%Y-%m-%d') AS ymd) deribed"
+    result = statement.execute
+    expect(result.first.first[1]).to be_an_instance_of(Date)
+  end
+
   it "should prepare Date values" do
     now = Date.today
     statement = @client.prepare('SELECT ? AS a')

Call Stack

1. ext/mysql2/statement.c rb_mysql_stmt_execute

static VALUE rb_mysql_stmt_execute(int argc, VALUE *argv, VALUE self) {

  ..........

  if (!is_streaming) {
    // cache all result
    rb_funcall(resultObj, intern_each, 0); // GOTO 2
  }

  return resultObj;
}

2. ext/mysql2/result.c rb_mysql_result_each

static VALUE rb_mysql_result_each(int argc, VALUE * argv, VALUE self) {
  ..........
  if (wrapper->stmt_wrapper) {
    fetch_row_func = rb_mysql_result_fetch_row_stmt;
  } else {
    fetch_row_func = rb_mysql_result_fetch_row;
  }

  return rb_mysql_result_each_(self, fetch_row_func, &args);  // GOTO 3
}

3. ext/mysql2/result.c rb_mysql_resulteach

static VALUE rb_mysql_result_each_(VALUE self,
                                   VALUE(*fetch_row_func)(VALUE, MYSQL_FIELD *fields, const result_each_args *args),
                                   const result_each_args *args)
{
  ..........
  if (wrapper->is_streaming) {
    ..........
  } else {
    if (args->cacheRows && wrapper->lastRowProcessed == wrapper->numberOfRows) { // If Use deribed table this condition is true
      ..........
    } else { // If no deribed table wrapper->lastRowProcessed == wrapper->numberOfRows is false
      ..........
      for (i = 0; i < wrapper->numberOfRows; i++) {
        VALUE row;
        if (args->cacheRows && i < rowsProcessed) {
          row = rb_ary_entry(wrapper->rows, i);
        } else {
          row = fetch_row_func(self, fields, args); // GOTO 4
  ..........

When no deribed table, If condition are like below.

args->cacheRows = 1
wrapper->lastRowProcessed = 0
(unsigned long)wrapper->numberOfRows = 2

4. ext/mysql2/result.c rb_mysql_result_fetch_row_stmt

static VALUE rb_mysql_result_fetch_row_stmt(VALUE self, MYSQL_FIELD * fields, const result_each_args *args)
{
  ..........
        case MYSQL_TYPE_DATE:         // MYSQL_TIME
        case MYSQL_TYPE_NEWDATE:      // MYSQL_TIME
          val = rb_funcall(cDate, intern_new, 3, INT2NUM(ts->year), INT2NUM(ts->month), INT2NUM(ts->day)); // ERROR!!!

Because ts-> year, ys->month, ts-> day are invalid.

ts->year = 16844770
ts->month = 0
ts->day = 0
ts->hour = 0
ts->minute = 0
ts->second = 0
ts->neg = 0
ts->second_part = 0
sodabrew commented 6 years ago

Phenomenal detective work! Are you sure you have not returned from the future with debugging superpowers? :)

... but, why the weird values? I wonder how MySQL is handing back the data if it's falling under the MYSQL_TYPE_NEWDATE case but actually something else?