`begin_work` retval is sometimes incorrect if connection is `RaiseError => 0`

noelcragg commented 2 years ago

Consider the following order of events:

the DBI::connect is invoked with RaiseError => 0 and returns $handle
the connection to the backend gets interrupted for some reason
calling $handle->begin_work currently always returns true

This is a weird but not uncommon case. For example, in a high-volume environment, one may have database connections being handled by a proxy or load-balancer. Software performing the latter functions sometimes crashes; hardware performing the latter functions sometimes locks up or reboots.

If begin_work notices an error performing its duties, it should return false. For some backends there's no error to notice, because begin_work is just setting the values of AutoCommit and BeginWork in memory. For others, modifying the value of AutoCommit may result in communication with the back-end database, during which errors happen.

Here's a minimal script to demonstrate the problem on u*x talking to a local MySQL db:

#!/usr/bin/perl                                                                                

use 5.14.0;
use warnings;

use DBI;
use POSIX;

my $dbh = DBI->connect('dbi:mysql:host=127.0.0.1;database=test', 'root', '',
                       { AutoCommit => 1, mysql_auto_reconnect => 0,
                         RaiseError => 0, PrintError => 1 })
  or die $DBI::errstr;
POSIX::close(3);                # or whatever fd it is on your system                          
$dbh->begin_work and say 'begin_work ok' or die 'begin_work: ' . ($dbh->errstr // 'undef');
$dbh->err and say '$dbh->err is ', $dbh->err;
$dbh->errstr and say '$dbh->errstr is ', $dbh->errstr;
$dbh->do('do 0') and say 'do ok'         or die         'do: ' . ($dbh->errstr // 'undef');
$dbh->commit     and say 'commit ok'     or die     'commit: ' . ($dbh->errstr // 'undef');
$dbh->disconnect and say 'disconnect ok' or die 'disconnect: ' . ($dbh->errstr // 'undef');

To get this to run on your own machine, you might have to strace the process and figure out what fd is getting opened if it's not 3 (as it has been on the various machines i've run this). Closing the fd simulates the connection failure I was describing above.

When run, you'll see:

DBD::mysql::db begin_work failed: Turning off AutoCommit failed at ./db-toy2-sb.pl line 14.
begin_work ok
$dbh->err is 21
$dbh->errstr is Turning off AutoCommit failed
DBD::mysql::db do failed: MySQL server has gone away at ./db-toy2-sb.pl line 17.
do: MySQL server has gone away at ./db-toy2-sb.pl line 17.

Note that we see "begin_work ok" in the output. This could only be displayed if $dbh->begin_work returned a true value. Because it does not, we continue and end up invoking the do method which does correctly detect that the connection has gone away and returns a false value.

I've worked around this issue in my own code by checking both the retval from begin_work and the err method of the handle. It would be nice not to have to do this.

pali commented 1 year ago

Hello!

I have looked at this issue. The problem is that $dbh and $sth tied hash attribute STORE method does not signal failure via return value when AutoCommit is supported by driver but changing its attribute failed (e.g. because connection was closed). When RaiseError is enabled then STORE method propagates error via raising exception.

My proposed solution is to extend dbd_db_STORE_attrib() and dbd_st_STORE_attrib() APIs to returns negative value on failure when AutoCommit attribute (or any other) is supported but changing its value failed. And then in begin_work() method check also return value of STORE method and propagates error.

This means that both DBI module and driver code needs to be changed for fixing this issue.

My proposed diffs are below for DBI module and DBD-MariaDB driver. Same would have to be done in every other driver affected by this issue. Unfortunately I do not see easy solution without changing drivers because AutoCommit is somehow already special and for C/XS drivers dbd_db_STORE_attrib() needs to return non-zero value for AutoCommit, which currently means no error.

Please let me know what do you think.

DBI code diff:

diff --git a/DBI.pm b/DBI.pm
index 96124838ce3a..9dde79a5ca05 100644
--- a/DBI.pm
+++ b/DBI.pm
@@ -1747,7 +1747,7 @@ sub _new_sth {    # called by DBD::<drivername>::db::prepare)
    my $dbh = shift;
    return $dbh->set_err($DBI::stderr, "Already in a transaction")
        unless $dbh->FETCH('AutoCommit');
-   $dbh->STORE('AutoCommit', 0); # will croak if driver doesn't support it
+   $dbh->STORE('AutoCommit', 0) or return; # will croak or return false if driver doesn't support it or setting failed
    $dbh->STORE('BegunWork',  1); # trigger post commit/rollback action
    return 1;
     }
diff --git a/Driver.xst b/Driver.xst
index c3981f130689..77508db49178 100644
--- a/Driver.xst
+++ b/Driver.xst
@@ -355,19 +355,22 @@ disconnect(dbh)
     RETVAL

-void
+bool
 STORE(dbh, keysv, valuesv)
     SV *        dbh
     SV *        keysv
     SV *        valuesv
     CODE:
     D_imp_dbh(dbh);
+    int ret;
     if (SvGMAGICAL(valuesv))
         mg_get(valuesv);
-    ST(0) = &PL_sv_yes;
-    if (!dbd_db_STORE_attrib(dbh, imp_dbh, keysv, valuesv))
-        if (!DBIc_DBISTATE(imp_dbh)->set_attr(dbh, keysv, valuesv))
-            ST(0) = &PL_sv_no;
+    ret = dbd_db_STORE_attrib(dbh, imp_dbh, keysv, valuesv);
+    if (ret == 0)
+        ret = DBIc_DBISTATE(imp_dbh)->set_attr(dbh, keysv, valuesv);
+    RETVAL = (ret > 0);
+    OUTPUT:
+    RETVAL

 void
@@ -779,19 +782,22 @@ blob_read(sth, field, offset, len, destrv=Nullsv, destoffset=0)
     }

-void
+bool
 STORE(sth, keysv, valuesv)
     SV *        sth
     SV *        keysv
     SV *        valuesv
     CODE:
     D_imp_sth(sth);
+    int ret;
     if (SvGMAGICAL(valuesv))
         mg_get(valuesv);
-    ST(0) = &PL_sv_yes;
-    if (!dbd_st_STORE_attrib(sth, imp_sth, keysv, valuesv))
-        if (!DBIc_DBISTATE(imp_sth)->set_attr(sth, keysv, valuesv))
-            ST(0) = &PL_sv_no;
+    ret = dbd_st_STORE_attrib(sth, imp_sth, keysv, valuesv);
+    if (ret == 0)
+        ret = DBIc_DBISTATE(imp_sth)->set_attr(sth, keysv, valuesv);
+    RETVAL = (ret > 0);
+    OUTPUT:
+    RETVAL

 # FETCH renamed and ALIAS'd to avoid case clash on VMS :-(

DBD-MariaDB diff:

diff --git a/dbdimp.c b/dbdimp.c
index ca4a6c0ede04..319cb85fabd7 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -3230,6 +3230,8 @@ mariadb_db_STORE_attrib(
   if (!imp_dbh->pmysql && !mariadb_db_reconnect(dbh, NULL))
   {
     mariadb_dr_do_error(dbh, CR_SERVER_GONE_ERROR, "MySQL server has gone away", "HY000");
+    if (memEQs(key, kl, "AutoCommit"))
+      return -1; /* -1 means we handled it as failed - important to avoid spurious errors */
     return 0;
   }

@@ -3248,7 +3250,7 @@ mariadb_db_STORE_attrib(
            )
         {
           mariadb_dr_do_error(dbh, mysql_errno(imp_dbh->pmysql), mysql_error(imp_dbh->pmysql), mysql_sqlstate(imp_dbh->pmysql));
-          return 1;  /* 1 means we handled it - important to avoid spurious errors */
+          return -1; /* -1 means we handled it as failed - important to avoid spurious errors */
         }
       }
       DBIc_set(imp_dbh, DBIcf_AutoCommit, bool_value);

noelcragg commented 1 year ago

This seems like a reasonable solution given the constraints. Thank you.

pali commented 1 year ago

Hm... I have read available DBI documentation again and also code for handling AutoCommit in DBI and I now think that this is not the correct solution.

Important parts are:

https://metacpan.org/pod/DBI#AutoCommit Attempting to set AutoCommit to an unsupported value is a fatal error. This is an important feature of the DBI. Applications that need full transaction behaviour can set $dbh->{AutoCommit} = 0 (or set AutoCommit to 0 via "connect") without having to check that the value was assigned successfully.
https://metacpan.org/pod/DBI::DBD#The-dbd_db_STORE_attrib-method If you are handling an attribute and something fails, you should ... but there are examples where even the DBI specification expects that you croak(). (See the AutoCommit method in DBI.)
DBI dbih_set_attr_k() function for "AutoCommit": https://github.com/perl5-dbi/dbi/blob/3803847b1e9b27ce8b9fd9647014c74e4c5430d5/DBI.xs#L2215-L2221
DBI begin_work() method: https://github.com/perl5-dbi/dbi/blob/3803847b1e9b27ce8b9fd9647014c74e4c5430d5/DBI.pm#L1750

If DBI driver does not support "AutoCommit" at all then trying to set $dbh->{AutoCommit} always throws exception (croak/die), also when RaiseError is explicitly disabled. This is mentioned in DBI AutoCommit documentation (quoted above; "fatal error"). Also in developer DBI::DBD documentation is mentioned that if DBI driver supports and handles AutoCommit then it should croak when setting AutoCommit attribute fails.

Therefore in my opinion the correct solution is just to fix DBD driver to properly die/croak when changing AutoCommit fails even when RaiseError is not enabled. And let DBI code unchanged.

Any opinion on this? With this change it means that $dbh->begin_work() will die on AutoCommit also when RaiseError is disabled.

DBD-MariaDB diff:

diff --git a/dbdimp.c b/dbdimp.c
index ca4a6c0ede04..777860f61c87 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -3214,6 +3214,8 @@ void mariadb_db_destroy(SV* dbh, imp_dbh_t* imp_dbh) {
  *  Returns: 1 for success, 0 otherwise
  *
  **************************************************************************/
+/* DBI expects that on AutoCommit failure driver croaks and does not return */
+#define croak_autocommit_failure() croak("Changing AutoCommit attribute failed: %" SVf, SVfARG(DBIc_ERRSTR(imp_dbh)))
 int
 mariadb_db_STORE_attrib(
                     SV* dbh,
@@ -3230,6 +3232,8 @@ mariadb_db_STORE_attrib(
   if (!imp_dbh->pmysql && !mariadb_db_reconnect(dbh, NULL))
   {
     mariadb_dr_do_error(dbh, CR_SERVER_GONE_ERROR, "MySQL server has gone away", "HY000");
+    if (memEQs(key, kl, "AutoCommit"))
+      croak_autocommit_failure(); /* does not return */
     return 0;
   }

@@ -3248,7 +3252,7 @@ mariadb_db_STORE_attrib(
            )
         {
           mariadb_dr_do_error(dbh, mysql_errno(imp_dbh->pmysql), mysql_error(imp_dbh->pmysql), mysql_sqlstate(imp_dbh->pmysql));
-          return 1;  /* 1 means we handled it - important to avoid spurious errors */
+          croak_autocommit_failure(); /* does not return */
         }
       }
       DBIc_set(imp_dbh, DBIcf_AutoCommit, bool_value);

timbunce commented 1 year ago

Thank you @pali for your thoughtful investigation and analysis.

Therefore in my opinion the correct solution is just to fix DBD driver to properly die/croak when changing AutoCommit fails even when RaiseError is not enabled. And let DBI code unchanged.

I agree.

noelcragg commented 1 year ago

Thanks for the detailed consideration.

I have a few thoughts in response.

The documentation states explicitly that trying to set "AutoCommit" when the database doesn't support it is a fatal error. I don't think that implies that a set operation that fails should invoke a fatal error. I believe the the docs underscore this, saying (emphasis mine):

Attempting to set "AutoCommit" to an unsupported value is a fatal error. This is an important feature of the DBI. Applications that need full transaction behaviour can set "$dbh->{AutoCommit} = 0" (or set "AutoCommit" to 0 via "connect") without having to check that the value was assigned successfully.

The proposed change would invalidate that last part.

I could not find the documentation matching @pali's assertion:

Also in developer DBI::DBD documentation is mentioned that if DBI driver supports and handles AutoCommit then it should croak when setting AutoCommit attribute fails.

The part of the DBI::DBD docs which document the dbd_db_STORE_attrib method say:

The return value is TRUE if you have handled the attribute or FALSE otherwise. If you are handling an attribute and something fails, you should call "dbd_drv_error()", so DBI can raise exceptions, if desired. If "dbd_drv_error()" returns, however, you have a problem: the user will never know about the error, because he typically will not check "$dbh->errstr()".

I cannot recommend a general way of going on, if "dbd_drv_error()" returns, but there are examples where even the DBI specification expects that you "croak()". (See the AutoCommit method in DBI.)

Other than saying one should invoke dbd_drv_error(), this isn't very proscriptive as far as calling croak() or not. If we should call croak() for these errors, shouldn't the documentation tell us to do so?

In the end I guess I'm just looking for agreement between the code and documentation.

Alternately, would the following simple change to DBI::begin_work do the right thing for both settings of RaiseError?

sub begin_work {
    my $dbh = shift;
    return $dbh->set_err($DBI::stderr, "Already in a transaction")
      unless $dbh->FETCH('AutoCommit');
    $dbh->STORE('AutoCommit', 0); # will croak if driver doesn't support it
    $dbh->err and return; # <--- return failure if we see an error has happened
    $dbh->STORE('BegunWork',  1); # trigger post commit/rollback action
    return 1;
}

pali commented 1 year ago

Other than saying one should invoke dbd_drv_error(), this isn't very proscriptive as far as calling croak() or not. If we should call croak() for these errors, shouldn't the documentation tell us to do so?

My understanding is that: if setting attribute fails then (fictional) dbd_drv_error() should be called. And if dbd_drv_error() returns (which means it did not throw exception) then in some cases DBI expects that driver croaks (throw exception), which is for example AutoCommit.

And about this:

    $dbh->STORE('AutoCommit', 0); # will croak if driver doesn't support it
    $dbh->err and return; # <--- return failure if we see an error has happened

It would not work because error may be set before ->STORE is called.

noelcragg commented 1 year ago

Thanks for the explanation.

pali commented 1 year ago

Fix for DBD::MariaDB with test case is here: https://github.com/perl5-dbi/DBD-MariaDB/pull/185

pali commented 1 year ago

@noelcragg: Could you test the fix?

pali commented 1 year ago

Fix for DBD::MariaDB was merged.

Tux commented 3 months ago

@pali is work in DBI still required?

perl5-dbi / dbi

`begin_work` retval is sometimes incorrect if connection is `RaiseError => 0` #104