mailchimp / mc-magento

Mailchimp for Magento 1 Integration. Syncs all data (subscriber, customers, orders, products) and enables marketing automation with email campaigns, automations, ads, postcards and more.
https://mailchimp.com/integrations/magento/
Open Software License 3.0
121 stars 94 forks source link

Missing index on batch_id causing occasional deadlocks in checkout #351

Closed TroyPatteson closed 7 years ago

TroyPatteson commented 7 years ago

In _sendEcommerceBatch() called from handleEcommerceBatches() in the _mailchimp_bulksync_ecommercedata cron job there are two queries against the _mailchimp_ecommerce_syncdata table where _batchid is NULL via calls to deleteUnsentItems() and markItemsAsSent(). As there is no index on the _batchid column of _mailchimp_ecommerce_syncdata table these queries essentially require write locks on all rows. This can result in a deadlock with queries issued via the transaction from core/Mage/Sales/Model/Service/Quote.php when placing an order during the checkout process. That transaction includes saving the customer, customer address and quote models and the order model before and after payment has been made which can take many seconds. Each of those saves invokes the corresponding MailChimp observer methods to update records in the _mailchimp_ecommerce_syncdata table. If the cron job runs during the checkout process it can get locks on rows in the _mailchimp_ecommerce_syncdata table required by the checkout transaction causing a deadlock. From experience this causes the order save in the checkout transaction to be rolled back and when the order status is subsequently saved it results in a foreign key constraint violation as the order ID the status refers to no longer exists. The end result is that the payment has been taken but the order fails. Attempting to checkout again causes a duplicate payment transaction error as the payment has already been made.

To further clarify this is happening here is the part of the output on SHOW ENGINE INNODB STATUS relating to the last deadlock at a time when the checkout failed after payment was taken with an integrity constraint violation on order_status_history:

LATEST DETECTED DEADLOCK
------------------------
2017-06-26 22:40:37 7fe512dbf700
*** (1) TRANSACTION:
TRANSACTION 4670917087, ACTIVE 8 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 5011 lock struct(s), heap size 620072, 707831 row lock(s)
MySQL thread id 6610395, OS thread handle 0x7fe512cca700, query id 666947521 1.2.3.5 cb73bb0a5e4a updating
DELETE FROM `mailchimp_ecommerce_sync_data` WHERE (batch_id IS NULL)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26631 page no 8747 n bits 112 index `PRIMARY` of table `production`.`mailchimp_ecommerce_sync_data` trx table locks 1 total table locks 2  trx id 4670917087 lock_mode X waiting lock hold time 7 wait time before grant 0 
*** (2) TRANSACTION:
TRANSACTION 4670917128, ACTIVE 7 sec starting index read
mysql tables in use 1, locked 1
53 lock struct(s), heap size 6544, 50 row lock(s), undo log entries 41
MySQL thread id 6610487, OS thread handle 0x7fe512dbf700, query id 666948677 production.sanitised.com 1.2.3.4 cb73bb0a5e4a updating
UPDATE `mailchimp_ecommerce_sync_data` SET `related_id` = '855', `type` = 'PRO', `mailchimp_store_id` = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', `mailchimp_sync_error` = '', `mailchimp_sync_delta` = '2017-06-27 02:35:18', `mailchimp_sync_modified` = '1', `mailchimp_sync_deleted` = '0', `mailchimp_token` = '', `batch_id` = 'ddc3ea2ae6' WHERE (id='649')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 26631 page no 8747 n bits 112 index `PRIMARY` of table `production`.`mailchimp_ecommerce_sync_data` trx table locks 27 total table locks 2  trx id 4670917128 lock_mode X locks rec but not gap lock hold time 7 wait time before grant 0 
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26631 page no 8 n bits 208 index `PRIMARY` of table `production`.`mailchimp_ecommerce_sync_data` trx table locks 27 total table locks 2  trx id 4670917128 lock_mode X locks rec but not gap waiting lock hold time 0 wait time before grant 0 
*** WE ROLL BACK TRANSACTION (2)

Note that DELETE causing 707831 row locks....

And a second one on the UPDATE query on a different magento store:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-06-27 00:40:14 7f7e93035700
*** (1) TRANSACTION:
TRANSACTION 3218640656, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 1626 lock struct(s), heap size 210472, 211782 row lock(s)
MySQL thread id 24850295, OS thread handle 0x7f7e8851f700, query id 1539633190 1.2.3.5 abf343d9a6fa updating
UPDATE `mailchimp_ecommerce_sync_data` SET `batch_id` = '9eb1f1f66a' WHERE (batch_id IS NULL AND mailchimp_store_id = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4793 page no 2643 n bits 192 index `PRIMARY` of table `production`.`mailchimp_ecommerce_sync_data` trx table locks 1 total table locks 2  trx id 3218640656 lock_mode X waiting lock hold time 0 wait time before grant 0 
*** (2) TRANSACTION:
TRANSACTION 3218639087, ACTIVE 5 sec starting index read
mysql tables in use 1, locked 1
56 lock struct(s), heap size 13864, 49 row lock(s), undo log entries 37
MySQL thread id 24850345, OS thread handle 0x7f7e93035700, query id 1539633446 production.sanitised.com 1.2.3.4 abf343d9a6fa updating
UPDATE `mailchimp_ecommerce_sync_data` SET `related_id` = '1884', `type` = 'PRO', `mailchimp_store_id` = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', `mailchimp_sync_error` = 'Bad Request : We were unable to process the request. A product with the provided ID already exists in the account.', `mailchimp_sync_delta` = '2017-06-23 03:10:09', `mailchimp_sync_modified` = '1', `mailchimp_sync_deleted` = '0', `mailchimp_token` = '', `batch_id` = 'ab4db25331' WHERE (id='227389')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 4793 page no 2643 n bits 192 index `PRIMARY` of table `production`.`mailchimp_ecommerce_sync_data` trx table locks 28 total table locks 2  trx id 3218639087 lock_mode X locks rec but not gap lock hold time 5 wait time before grant 0 
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4793 page no 1249 n bits 168 index `PRIMARY` of table `production`.`mailchimp_ecommerce_sync_data` trx table locks 28 total table locks 2  trx id 3218639087 lock_mode X locks rec but not gap waiting lock hold time 0 wait time before grant 0 
*** WE ROLL BACK TRANSACTION (2)

The UPDATE caused 211782 row locks before the deadlock discovered.

The solution is obviously to create an index on _batchid with something like:

ALTER TABLE mailchimp_ecommerce_sync_data ADD INDEX `IDX_M4M_ECOMMERCE_SYNC_DATA_BATCH_ID` (`batch_id`);

Note that we are using the Idev_OneStepCheckout extension but I am not sure it will matter which checkout extension is used.

I don't know whether the index will eliminate all deadlocks but it is certainly a good start and also speeds up those DELETE and UPDATE queries run in the CRON job.

centerax commented 7 years ago

Thanks for this feedback Troy.

We will review this an provide our thoights asap

TroyPatteson commented 7 years ago

By the way, we haven't had another deadlock since creating that index on any of our 3 stores.

Santiagoebizmarts commented 7 years ago

Hello,

Thanks for your feedback. We will implement this in the next version.

Regards.