veda-consulting-company / uk.co.vedaconsulting.mailchimp

Other
22 stars 43 forks source link

Performance Issues when syncing Civi to Mailchimp #297

Open ghost opened 6 years ago

ghost commented 6 years ago

On large lists (10k+ contacts), the performance of the "push" operation (Civi to Mailchimp) is quite poor. MySQL frequently times out during the API call in CRM_Mailchimp_Sync::collectCiviCrm(). (My MySQL query timeout is 10 minutes.)

The culprit is the following code:

$result = civicrm_api3('Contact', 'get', [
      'is_deleted' => 0,
      // The email filter in comment below does not work (CRM-18147)
      // 'email' => array('IS NOT NULL' => 1),
      // Now I think that on_hold is NULL when there is no e-mail, so if
      // we are lucky, the filter below implies that an e-mail address
      // exists ;-)
      'is_opt_out' => 0,
      'do_not_email' => 0,
      'on_hold' => 0,
      'is_deceased' => 0,
      'group' => $this->membership_group_id,
      'return' => ['first_name', 'last_name', 'group'],
      'options' => ['limit' => 0],
      //'api.Email.get' => ['on_hold'=>0, 'return'=>'email,is_bulkmail'],
    ]);

The 'return' => ['group'] in this context results in a very inefficient mysql query.

The following code has fixed the performance problem for me, by querying groups in a separate API call:

$result = civicrm_api3('Contact', 'get', [
      'is_deleted' => 0,
      // The email filter in comment below does not work (CRM-18147)
      // 'email' => array('IS NOT NULL' => 1),
      // Now I think that on_hold is NULL when there is no e-mail, so if
      // we are lucky, the filter below implies that an e-mail address
      // exists ;-)
      'is_opt_out' => 0,
      'do_not_email' => 0,
      'on_hold' => 0,
      'is_deceased' => 0,
      'group' => $this->membership_group_id,
      'return' => ['first_name', 'last_name'],
      'options' => ['limit' => 0],
      //'api.Email.get' => ['on_hold'=>0, 'return'=>'email,is_bulkmail'],
    ]);

    if ($result['count'] == 0) {
      // No-one is in the group according to CiviCRM.
      return 0;
    }

    // JMW pull groups in separate API query for performance purposes
    $g_result = civicrm_api3('Contact', 'get', [
      'contact_id' => ['IN' => array_keys($result['values'])],
      'return' => ['group'],
      'options' => ['limit' => 0],
    ]);
    foreach ($g_result['values'] as $cid => $gm) {
      $result['values'][$cid]['groups'] = $gm['groups'];
    }