twomice / com.joineryhq.jsumfields

Provides additional fields under the Summary Fields framework.
Other
1 stars 10 forks source link

avoid divide by zero error #8

Open jmcclelland opened 4 years ago

jmcclelland commented 4 years ago

Depending on your version of Mysql/Mariadb your "sql_mode" setting may be set to ERROR_FOR_DIVISION_BY_ZERO - in which case any query that has a zero as the denominator is going to throw a CiviCRM back trace. It's even worse because the error log doesn't show that it's from a trigger so it can be really confusing.

I wrestled with this bug over in sumfields and figure out a fix. I'll submit a PR for jsumfields using the same approach.

jmcclelland commented 4 years ago

Hm. Well maybe not so easy. My trick of coalescing to 1 instead of 0 doesn't work. Here's the query that is failing:

MariaDB [ptp]> INSERT INTO civicrm_value_summary_fields_51 (entity_id, open_rate_rate_all_time_609)        
    -> 
    -> SELECT t.contact_id, t.rate FROM (           
    ->   SELECT s.contact_id, ROUND(coalesce(coalesce(o.opened, 0) / (coalesce(s.sent, 1) - coalesce(b.bounced, 0)), 0) * 100, 2) as rate
    ->     FROM (
    ->       SELECT q.contact_id, count(*) as sent
    ->         FROM civicrm_mailing_event_queue q
    ->           INNER JOIN civicrm_mailing_event_delivered d ON d.event_queue_id = q.id
    ->         WHERE 1
    ->         GROUP BY q.contact_id
    ->     ) s
    ->     LEFT JOIN (                          
    ->      SELECT q.contact_id, count(*) as opened
    ->        FROM civicrm_mailing_event_queue q
    ->          INNER JOIN civicrm_mailing_event_opened o ON o.event_queue_id = q.id
    ->        WHERE 1
    ->        GROUP BY q.contact_id
    ->     ) o ON o.contact_id = s.contact_id
    ->     LEFT JOIN (                          
    ->       SELECT q.contact_id, count(*) as bounced
    ->       FROM civicrm_mailing_event_queue q
    ->         INNER JOIN civicrm_mailing_event_bounce b ON b.event_queue_id = q.id
    ->       WHERE 1
    ->       GROUP BY q.contact_id
    ->     ) b ON b.contact_id = s.contact_id
    ->   ) t
    ->   ON DUPLICATE KEY UPDATE open_rate_rate_all_time_609 = t.rate;
ERROR 1365 (22012): Division by 0
MariaDB [ptp]>
twomice commented 4 years ago

Thanks for reporting this @jmcclelland , and for pointing to an idea for a fix.

Could you tell me an example situation in which this error would appear? I know we would need to have ERROR_FOR_DIVISION_BY_ZERO, but in terms of mailings, what data would cause this error to appear?

jmcclelland commented 4 years ago

For me, that failing query happened when I tried to save new settings - so it's not the sql trigger, but the initial sql statement that re-calculates all the records. It must be hitting a user with no mailings - so the total sent is zero.

twomice commented 4 years ago

Okay, thank you, @jmcclelland . I'll have to look into this further.

petednz commented 4 years ago

Any timetable for looking in to this? We hit same when checking this extension out.

twomice commented 4 years ago

We're looking at it this week, @petednz . Thanks for the nudge. First step is to reproduce it. Can you tell more about a) where you see this error (civicrm log files? on screen? etc.) and b) specific steps you take to produce the error and/or avoid it (e.g., only with specific settings on the Summary Fields form; only for specific fields such as open_rate_rate_all_time or others; etc.) Thanks!

petednz commented 4 years ago

ours was a d8 site in case that gives you a 'of course' moment. will get jitendra to supply more details

twomice commented 4 years ago

@petednz and @jmcclelland : From the OP, it seems this error shoiuld only appear when there's a 0 in the denominator. Can you please run this SELECT query and tell me whether it returns any rows, or whether it throws a 'divide by zero' error ? (Those rows, if returned, would be the one with a zero denominator.)

SELECT t.contact_id, t.rate, t.denom FROM (           
SELECT (coalesce(s.sent, 1) - coalesce(b.bounced, 0)) as denom, s.contact_id, ROUND(coalesce(coalesce(o.opened, 0) / (coalesce(s.sent, 1) - coalesce(b.bounced, 0)), 0) * 100, 2) as rate
  FROM (
    SELECT q.contact_id, count(*) as sent
      FROM civicrm_mailing_event_queue q
        INNER JOIN civicrm_mailing_event_delivered d ON d.event_queue_id = q.id
      WHERE 1
      GROUP BY q.contact_id
  ) s
  LEFT JOIN (                          
   SELECT q.contact_id, count(*) as opened
     FROM civicrm_mailing_event_queue q
       INNER JOIN civicrm_mailing_event_opened o ON o.event_queue_id = q.id
     WHERE 1
     GROUP BY q.contact_id
  ) o ON o.contact_id = s.contact_id
  LEFT JOIN (                          
    SELECT q.contact_id, count(*) as bounced
    FROM civicrm_mailing_event_queue q
      INNER JOIN civicrm_mailing_event_bounce b ON b.event_queue_id = q.id
    WHERE 1
    GROUP BY q.contact_id
  ) b ON b.contact_id = s.contact_id
) t
WHERE 
t.denom = 0
jmcclelland commented 4 years ago

This is really hard to replicate due to subtle differences in how mysql and mariadb handle divide by zero errors and also how CiviCRM reponds to database warnings.

I ran the query in my mariadb cli and got proper results, but also a warning. Here's the end of the query;


|      37036 | 0.00 |     0 |
|      37049 | 0.00 |     0 |
|      37117 | 0.00 |     0 |
|      37409 | 0.00 |     0 |
|      37489 | 0.00 |     0 |
+------------+------+-------+
2164 rows in set, 2164 warnings (2.153 sec)

And here is how I can see if my database is configured to show these warnings via the ERROR_FOR_DIVISION_BY_ZERO (MariaDB):

MariaDB [ptp]> SHOW VARIABLES LIKE 'sql_mode';
+---------------+-------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                     |
+---------------+-------------------------------------------------------------------------------------------+
| sql_mode      | STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+---------------+-------------------------------------------------------------------------------------------+
1 row in set (0.002 sec)

MariaDB [ptp]>

And here's a simple way to replicate the warning:

MariaDB [ptp]> select 4 / 0;
+-------+
| 4 / 0 |
+-------+
|  NULL |
+-------+
1 row in set, 1 warning (0.001 sec)

MariaDB [ptp]>

However, here I got confused. I thought this would throw a back trace on me and demonstrate that my instace can't tolerate division by zero:

www-data@333f090f552b:~/powerbase$ cv php:eval 'CRM_Core_DAO::executeQuery("SELECT 4/0")'

www-data@333f090f552b:~/powerbase$ 

I am not sure why there was no backtrace. I'm definitely running a different version of MariaDB now then I was back in January when I reported the error (10.3.23 now, possibly 10.1.45 when I first encountered the error).

I also was running CiviCRM 5.13.x when I encountered the error, now running 5.27.x.

So, MariaDB's handling of division by zero may have changed and also CiviCRM's handling of database warrnings may have changed.

Sheesh, you asked for help and I just blurted out a lot of confusion instead. Sorry! Hope there is something here that might be helpful.

twomice commented 4 years ago

Thanks @jmcclelland That's quite helpful. We're still looking.

twomice commented 4 years ago

@petednz we're slowly making progress on this. When you have a moment, can you please tell me what your system does in response to this command:

$ cv php:eval 'CRM_Core_DAO::executeQuery("SELECT 4/0")'

I'm hoping for a backtrace, since that would mimic the 'divide by zero' behavior mentioned in the OP.

jitendrapurohit commented 4 years ago

The above PHP code returns NULL on our side.

$ cv php:eval 'CRM_Core_DAO::executeQuery("SELECT 4/0")' -v

The failed query is while inserting rows to the custom table -

INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate_all_time_189)
SELECT t.contact_id, t.rate
    FROM
      (
      SELECT
        s.contact_id, ROUND(coalesce(coalesce(o.opened, 0) / (coalesce(s.sent, 0) - coalesce(b.bounced, 0)), 0) * 100, 2) as rate
      FROM
      (
        -- total mailings sent to contact
        SELECT q.contact_id, count(*) as sent
        FROM
          civicrm_mailing_event_queue q
          INNER JOIN civicrm_mailing_event_delivered d ON d.event_queue_id = q.id
        WHERE
          1
        GROUP BY
          q.contact_id
      ) s
      LEFT JOIN (
        -- total mailings opened
        SELECT q.contact_id, count(*) as opened
        FROM
          civicrm_mailing_event_queue q
          INNER JOIN civicrm_mailing_event_opened o ON o.event_queue_id = q.id
        WHERE
          1
        GROUP BY
          q.contact_id
      ) o ON o.contact_id = s.contact_id
      LEFT JOIN (
        -- total mailings bounced
        SELECT q.contact_id, count(*) as bounced
        FROM
          civicrm_mailing_event_queue q
          INNER JOIN civicrm_mailing_event_bounce b ON b.event_queue_id = q.id
        WHERE
          1
        GROUP BY
          q.contact_id
      ) b ON b.contact_id = s.contact_id
    )  t
    ON DUPLICATE KEY UPDATE open_rate_rate_all_time_189 = t.rate;

The complete backtrace is -

[debug] $backTrace = #0 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Error.php(937): CRM_Core_Error::backtrace("backTrace", TRUE)
#1 /var/www/civicrm-master/sites/all/modules/civicrm/vendor/pear/pear-core-minimal/src/PEAR.php(922): CRM_Core_Error::exceptionHandler(Object(DB_Error))
#2 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB.php(997): PEAR_Error->__construct("DB Error: division by zero", -13, 16, (Array:2), "\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#3 /var/www/civicrm-master/sites/all/modules/civicrm/vendor/pear/pear-core-minimal/src/PEAR.php(575): DB_Error->__construct(-13, 16, (Array:2), "\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#4 /var/www/civicrm-master/sites/all/modules/civicrm/vendor/pear/pear-core-minimal/src/PEAR.php(223): PEAR->_raiseError(Object(DB_mysqli), NULL, -13, 16, (Array:2), "\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...", "DB_Error", TRUE)
#5 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB/common.php(1925): PEAR->__call("raiseError", (Array:7))
#6 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB/mysqli.php(936): DB_common->raiseError(-13, NULL, NULL, "\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...", "1365 ** Division by 0")
#7 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB/mysqli.php(406): DB_mysqli->mysqliRaiseError()
#8 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB/common.php(1231): DB_mysqli->simpleQuery("\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#9 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB/DataObject.php(2696): DB_common->query("\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#10 /var/www/civicrm-master/sites/all/modules/civicrm/packages/DB/DataObject.php(1829): DB_DataObject->_query("\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#11 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/DAO.php(439): DB_DataObject->query("\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#12 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/DAO.php(1528): CRM_Core_DAO->query("\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...", TRUE)
#13 /var/www/civicrm-master/sites/default/files/civicrm/ext/com.joineryhq.jsumfields/jsumfields.php(2180): CRM_Core_DAO::executeQuery("\n      INSERT INTO civicrm_value_summary_field_28 (entity_id, open_rate_rate...")
#14 /var/www/civicrm-master/sites/default/files/civicrm/ext/com.joineryhq.jsumfields/CRM/Jsumfields/APIWrapperSumfieldsGendata.php(15): _jsumfields_generate_data_based_on_current_data()
#15 /var/www/civicrm-master/sites/all/modules/civicrm/Civi/API/Subscriber/WrapperAdapter.php(72): CRM_Jsumfields_APIWrapperSumfieldsGendata->toApiOutput((Array:8), (Array:5))
#16 /var/www/civicrm-master/sites/all/modules/civicrm/vendor/symfony/event-dispatcher/EventDispatcher.php(214): Civi\API\Subscriber\WrapperAdapter->onApiRespond(Object(Civi\API\Event\RespondEvent), "civi.api.respond", Object(Civi\Core\CiviEventDispatcher))
#17 /var/www/civicrm-master/sites/all/modules/civicrm/vendor/symfony/event-dispatcher/EventDispatcher.php(44): Symfony\Component\EventDispatcher\EventDispatcher->doDispatch((Array:5), "civi.api.respond", Object(Civi\API\Event\RespondEvent))
#18 /var/www/civicrm-master/sites/all/modules/civicrm/Civi/Core/CiviEventDispatcher.php(126): Symfony\Component\EventDispatcher\EventDispatcher->dispatch("civi.api.respond", Object(Civi\API\Event\RespondEvent))
#19 /var/www/civicrm-master/sites/all/modules/civicrm/Civi/API/Kernel.php(256): Civi\Core\CiviEventDispatcher->dispatch("civi.api.respond", Object(Civi\API\Event\RespondEvent))
#20 /var/www/civicrm-master/sites/all/modules/civicrm/Civi/API/Kernel.php(152): Civi\API\Kernel->respond(Object(Civi\API\Provider\MagicFunctionProvider), (Array:8), (Array:5))
#21 /var/www/civicrm-master/sites/all/modules/civicrm/Civi/API/Kernel.php(81): Civi\API\Kernel->runRequest((Array:8))
#22 /var/www/civicrm-master/sites/all/modules/civicrm/api/api.php(22): Civi\API\Kernel->runSafe("SumFields", "Gendata", (Array:2))
#23 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/JobManager.php(133): civicrm_api("SumFields", "Gendata", (Array:2))
#24 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/JobManager.php(100): CRM_Core_JobManager->executeJob(Object(CRM_Core_ScheduledJob))
#25 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Admin/Form/Job.php(195): CRM_Core_JobManager->executeJobById(26)
#26 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Form.php(504): CRM_Admin_Form_Job->postProcess()
#27 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/QuickForm/Action/Submit.php(58): CRM_Core_Form->mainProcess()
#28 /var/www/civicrm-master/sites/all/modules/civicrm/packages/HTML/QuickForm/Controller.php(203): CRM_Core_QuickForm_Action_Submit->perform(Object(CRM_Admin_Form_Job), "submit")
#29 /var/www/civicrm-master/sites/all/modules/civicrm/packages/HTML/QuickForm/Page.php(103): HTML_QuickForm_Controller->handle(Object(CRM_Admin_Form_Job), "submit")
#30 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Controller.php(347): HTML_QuickForm_Page->handle("submit")
#31 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Page/Basic.php(395): CRM_Core_Controller->run()
#32 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Page/Basic.php(140): CRM_Core_Page_Basic->edit(4, 26)
#33 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Admin/Page/Job.php(134): CRM_Core_Page_Basic->run()
#34 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Invoke.php(312): CRM_Admin_Page_Job->run((Array:3), NULL)
#35 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Invoke.php(68): CRM_Core_Invoke::runItem((Array:15))
#36 /var/www/civicrm-master/sites/all/modules/civicrm/CRM/Core/Invoke.php(36): CRM_Core_Invoke::_invoke((Array:3))
#37 /var/www/civicrm-master/sites/all/modules/civicrm/drupal/civicrm.module(454): CRM_Core_Invoke::invoke((Array:3))
#38 /var/www/civicrm-master/includes/menu.inc(527): civicrm_invoke("admin", "job")
#39 /var/www/civicrm-master/index.php(21): menu_execute_active_handler()

which points to query written for mail_openrate_alltime field -

If the ROUND statement in the query added to calculate rate (line 202) is updated from

ROUND(coalesce(coalesce(o.opened, 0) / (coalesce(s.sent, 0) - coalesce(b.bounced, 0)), 0) * 100, 2) as rate

to

ROUND(coalesce(coalesce(o.opened, 0) / NULLIF(coalesce(s.sent, 0) - coalesce(b.bounced, 0), 0), 0) * 100, 2) as rate

fixes the error and also avoids the 0 in the denominator? Does this make sense?

jitendrapurohit commented 4 years ago

Ah, looks like there are similar ROUND statements in other 9-10 places as well. So the above replacement might be needed in all those places to completely avoid the error in the job execution.

twomice commented 4 years ago

Thanks @jitendrapurohit. My notes:

  1. It's not the ROUND() that is the problem; rather it's division by zero, which we should be guarding against in any case where we're dividing by a calculated value, since that calculated value can sometimes turn out to be zero.
  2. There are indeed several queries where we're dividing by a calculated value.
  3. The NULLIF in jitendrapurohit's comment works because it returns NULL where the two values are equal, i.e., where subtracting one from the other would equal zero; this means the denominator is NULL instead of zero, and dividing by NULL will simply return NULL without error or warning; and then, because the entire division returns NULL, and because it's used inside COALESCE, that NULL value is skipped and COALESCE uses the next non-null value, which in this case is zero, which is a valid value for any of these 'bounce rate' or 'open rate'.
twomice commented 4 years ago

This should be fixed now, as of https://github.com/twomice/com.joineryhq.jsumfields/pull/13/commits/6ea45a43c862acadaa6b4a18bb1ed9fb6e2223a9. Thanks @jmcclelland , @jitendrapurohit and @lucky091588 ! (If someone tries this on master, please confirm here in the ticket that it works for you. Leaving this open for a while, pending community review.)

JoeMurray commented 3 years ago

Nice to see the cooperative community spirit at work here everyone. Cheers, Joe