magento / inventory

Magento Inventory Project (a.k.a MSI)
Open Software License 3.0

Simultaneous changes that require indexing will collide and MySQL will throw errors #2933

Open cmacdonald-au opened 4 years ago

cmacdonald-au commented 4 years ago

Preconditions (*)

  1. Magento2 2.4-develop
  2. Inventory 1.2-develop
  3. 100+ sources
  4. 1000+ items assigned to 50% or more of those sources

Steps to reproduce (*)

  1. Assign all sources to a stock.
  2. Execute the following scenarios multiple times, simultaneously:
    • Remove a source item assignment
    • Add a source item assignment
    • Change the qty levels for a source item across multiple sources

Expected result (*)

  1. The indexer should manage multiple attempts to update the index seamlessly
  2. Data should become eventually consistent

Actual result (*)

  1. Indexer processing collides in the database;
report.CRITICAL: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '-BED-Q-56-67614' for key 'PRIMARY', query was: INSERT  INTO `inventory_stock_3_replica`....
report.CRITICAL: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'm2_core.inventory_stock_3_replica' doesn't exist, query was: INSERT  INTO `inventory_stock_3_replica....
  2. Index data is left in an inconsistent/unknown state
  3. System performance is negatively impacted
maghamed commented 4 years ago

Here is a dump of our conversation in Slack (as it will eventually be rotated and vanish there)


Colin Macdonald  2 hours ago
From where I'm sitting, the root cause is that realtime reindexing is enforced in some areas, but not others.
Meaning, it's possible to have a "burst" of activity that triggers contention on the index process.
Even if I were to batch and defer the individual updates (the ones we control via automated updates) that trigger the indexer, it's still highly likely that an index table collision would occur based on "other" activity.
E.g. if the indexer is running on a schedule, and replenishment happens manually through the admin, or through interaction with an order, indexing gets triggered.
Now we have two processes attempting to index - one of those will "lose", leading to a deadlock/stale index.
It's 7am here now, will get a coffee and get my brain ticking over :)

Igor Miniailo:slowpoke:  2 hours ago
no-no, I totally got your business case

Igor Miniailo:slowpoke:  2 hours ago
I think what I described here might work for you - https://magentocommeng.slack.com/archives/C5FU5E2HY/p1586035496109000?thread_ts=1586025723.097100&cid=C5FU5E2HY

Igor Miniailo
Potentially, if you manage sources and source quantity in an external system, you might provide a new Inventory indexer and overwrite the out-of-the-box one. This indexer would fill up the Stock tables in Magento, gathering data from the external system.
In this customization you need to send the reservation/order to the external system so it can decide which sources should be used to fulfill an order placed on the Magento side. Thus all the source qty management will happen not in Magento, but in the external system.
And last but not least - you need to prevent any Source Item manipulation on the Magento side. Currently we have that when a Shipment or a Credit Memo is created.
From a thread in #msi | Today at 4:24 PM

Igor Miniailo:slowpoke:  2 hours ago
as in this case only one party would be responsible for Source deduction - an external system. While Magento would always work with Stock and Reservations

Igor Miniailo:slowpoke:  2 hours ago
btw, are you getting to the situation with Deadlocks / wait locks on Inventory index now?

Igor Miniailo:slowpoke:  2 hours ago
I will go make my afternoon coffee as well

Colin Macdonald  2 hours ago
Yeah, deadlock and wait locks are my current pain.

Igor Miniailo:slowpoke:  1 hour ago
cc //@Stanislav Idolov
can you describe the deadlock scenarios? I might expect wait locks here, but deadlocks are definitely something we need to eliminate

Stanislav Idolov  1 hour ago
I’ll take a look at it, thank you for the input!

Colin Macdonald  1 hour ago
I have another scenario that’s actually more pressing and highlights why I was looking at the “ignore indexing preferences in some scenarios”.

Colin Macdonald  1 hour ago
The forced index process can be fired multiple times at once, leading to multiple attempts at creating/inserting/dropping the replica table happening at the same time

Colin Macdonald  1 hour ago
I’ll summarise the result, but it looks like this;

Colin Macdonald  1 hour ago
report.CRITICAL: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '-BED-Q-56-67614' for key 'PRIMARY', query was: INSERT  INTO `inventory_stock_3_replica`....
report.CRITICAL: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'm2_core.inventory_stock_3_replica' doesn't exist, query was: INSERT  INTO `inventory_stock_3_replica

Colin Macdonald  1 hour ago
(I’ve checked, the inventory data is clean)

Colin Macdonald  1 hour ago
it's just that `ReindexAfterSourceItemsDeletePlugin` and `ReindexAfterSourceItemsSavePlugin` force the indexer to run.

Igor Miniailo:slowpoke:  1 hour ago
It looks like something we need to investigate, @Stanislav Idolov. Btw, @col don't you want us to help investigate/fix this issue?

Colin Macdonald  1 hour ago
I do, yeah

Colin Macdonald  1 hour ago
Is why I’m here on a Sunday morning :slightly_smiling_face:

Colin Macdonald  1 hour ago
the deadlocks are triggered thanks to this btw;

Colin Macdonald  1 hour ago
$connection->insertOnDuplicate($tableName, $batchDocuments, $columns);
But again, it’s purely about multi-request contention. Each request operates as if it’s in a single-thread world, so the database loses when a transaction gets held up “elsewhere”. (edited) 
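One common mitigation for this kind of contention (not the fix the thread ultimately settled on) is to retry the batched upsert when the adapter reports a deadlock or lock wait timeout. A minimal sketch, reusing the `$connection`, `$tableName`, `$batchDocuments` and `$columns` from the call above:

```php
use Magento\Framework\DB\Adapter\DeadlockException;
use Magento\Framework\DB\Adapter\LockWaitException;

// Retry the batched upsert a few times when a concurrent indexer run
// holds conflicting locks on the same replica table.
$maxAttempts = 3;
for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
    try {
        $connection->insertOnDuplicate($tableName, $batchDocuments, $columns);
        break;
    } catch (DeadlockException | LockWaitException $e) {
        if ($attempt === $maxAttempts) {
            throw $e;
        }
        // Back off briefly (and progressively) before retrying; the
        // competing transaction usually finishes within a few hundred ms.
        usleep(200000 * $attempt);
    }
}
```

This only papers over the symptom, though; the contention itself remains, which is why the conversation below moves toward a single dedicated indexing process.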

Colin Macdonald  1 hour ago
Anyway, that’s the background for my current pain. I’ll start running through existing backlog items and see what I can do that helps alleviate some of the symptoms.

Igor Miniailo:slowpoke:  1 hour ago
@col btw another possibly valid approach is to move all re-indexation to a dedicated process.
So, instead of performing re-indexation within the scope of the process handling the current request, we would just send the items that need to be re-indexed to a message queue. There would be only one indexer process responsible for re-indexation.
Thus we would prevent the possibility of a race condition

Colin Macdonald  44 minutes ago
As in, schedule a reindex on demand, when it’s required, rather than rely on “on save” or “by schedule”? That’s a neat idea too.

Igor Miniailo:slowpoke:  43 minutes ago
both of these cases:
\Magento\InventoryIndexer\Plugin\InventoryApi\ReindexAfterSourceItemsDeletePlugin
\Magento\InventoryIndexer\Plugin\InventoryApi\ReindexAfterSourceItemsSavePlugin
might just send the SourceItems that need to be re-indexed to a queue. Thus, instead of performing indexation immediately:
        if (count($sourceCodes)) {
            $this->sourceIndexer->executeList($sourceCodes);
        }

Igor Miniailo:slowpoke:  43 minutes ago
just send the codes to the process which is responsible for reindexation

Igor Miniailo:slowpoke:  42 minutes ago
moreover, it might not run immediately either, but combine many requests at once
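A rough sketch of what the "publish instead of reindex" variant of `ReindexAfterSourceItemsSavePlugin` could look like. The topic name and the publisher wiring are illustrative assumptions, not an existing MSI queue:

```php
use Magento\Framework\MessageQueue\PublisherInterface;

class ReindexAfterSourceItemsSavePlugin
{
    /**
     * Hypothetical topic; the customization would have to declare it in
     * communication.xml and the queue topology files.
     */
    private const TOPIC_NAME = 'inventory.source_items.reindex';

    /**
     * @var PublisherInterface
     */
    private $publisher;

    public function __construct(PublisherInterface $publisher)
    {
        $this->publisher = $publisher;
    }

    // ...collect $sourceCodes exactly as the existing plugin does...

    private function requestReindex(array $sourceCodes): void
    {
        if (count($sourceCodes)) {
            // Instead of $this->sourceIndexer->executeList($sourceCodes),
            // hand the codes to a single consumer process that performs
            // the actual reindex and can batch several requests together.
            $this->publisher->publish(self::TOPIC_NAME, $sourceCodes);
        }
    }
}
```

With this, request-handling processes never touch the index tables themselves; a single consumer started via `bin/magento queue:consumers:start` owns all index writes, which is what removes the cross-request contention.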

Colin Macdonald  41 minutes ago
yup, cool, I’ll give that a shot

Igor Miniailo:slowpoke:  41 minutes ago
so, it's still on save as we have it now, but we prevent the situation where each process performs re-indexation on its own. There would be a dedicated process responsible for the reindex

Igor Miniailo:slowpoke:  41 minutes ago
it should work

Igor Miniailo:slowpoke:  39 minutes ago
moreover - that will improve the performance of admin scenarios as well, as we will eliminate the DB inserts there

Igor Miniailo:slowpoke:  38 minutes ago
@col don't hesitate to contact @vnayda if you have any questions regarding Magento or MSI. He is one of the two MSI creators and luckily it's currently his shift to support Contribution Day :slightly_smiling_face:

Colin Macdonald  38 minutes ago
concur - It would solve my immediate issue without introducing significant side-effects

Igor Miniailo:slowpoke:  37 minutes ago
but it would be great to make it clean enough to be delivered in 2.4.0 :slightly_smiling_face:

Valerii Naida:magento:  22 minutes ago
Also, I will ask @Stanislav Idolov to pay attention to this thread
BTW we can use pt-deadlock-logger for logging deadlock queries (edited)

Colin Macdonald  20 minutes ago
I won't be able to get you in-depth detail today (we've rewritten it), but I'll create an issue and get as much in there as I can.

Colin Macdonald  17 minutes ago
We only have our production use-case as a scenario as well, so I'll work on getting something more reproducible for dev/test - this only happens when there are a lot of simultaneous requests being processed.
cmacdonald-au commented 4 years ago

Options summary;

  1. Honour "Index by Schedule" for all changes

    Discarded due to known side-effects as discussed in https://github.com/magento/inventory/issues/2002

  2. Dedicated indexing processor;

    • Implement a queue to capture "immediate" index requests
    • Trigger an ad-hoc "immediate" reindex job to ensure only one reindex happens at a time
  3. Dedicated partial reindexer;

    • Same as option 2, except only act on those items that require attention (partial reindex); a consumer sketch follows this list.
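A sketch of what the consumer side of options 2 and 3 could look like. The consumer class and its queue wiring are hypothetical, and `\Magento\InventoryIndexer\Indexer\Source\SourceIndexer` is assumed to be the `$sourceIndexer` dependency that the plugins quoted above call `executeList()` on:

```php
use Magento\InventoryIndexer\Indexer\Source\SourceIndexer;

/**
 * Hypothetical consumer for the 'inventory.source_items.reindex' topic:
 * one dedicated process drains the queue, de-duplicates the source codes
 * and runs a single partial reindex instead of many overlapping ones.
 */
class SourceReindexConsumer
{
    /**
     * @var SourceIndexer
     */
    private $sourceIndexer;

    public function __construct(SourceIndexer $sourceIndexer)
    {
        $this->sourceIndexer = $sourceIndexer;
    }

    /**
     * Queue callback. Because only this consumer calls the indexer,
     * concurrent create/insert/drop of the replica table cannot happen.
     *
     * @param string[] $sourceCodes
     */
    public function processMessage(array $sourceCodes): void
    {
        $sourceCodes = array_values(array_unique($sourceCodes));
        if ($sourceCodes) {
            // Partial reindex (option 3): touch only the affected sources.
            $this->sourceIndexer->executeList($sourceCodes);
        }
    }
}
```

Combining several queued messages into one `executeList()` call, as suggested in the Slack thread, could be layered on top of this via the consumer's batch configuration.
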
aligent-lturner commented 2 years ago

I'm not sure if the asynchronous indexing was introduced as a solution to this issue, but that's not the full extent of the problem - at least not anymore.

This class - https://github.com/magento/inventory/blob/develop/InventoryCatalogSearch/Plugin/InventoryIndexer/Indexer/SourceItem/Strategy/Sync/FulltextIndexUpdater.php - was introduced in 2021, with a commit message of "MC-42287: Items count is wrong on Category page". After a stock reindex (whether synchronous or asynchronous), the full text indexer will get executed, regardless of whether or not it is in save or scheduled mode. The end result is that the same sort of problems can arise - e.g.

  1. Cron job indexer_update_all_views is running, and is in the process of updating the full text indexer for a given scope
  2. An API call results in source items being updated/deleted, which triggers the functionality mentioned above. As the first part of the executeList function, rows are deleted from the table for all affected entity ids (which could be a lot).
  3. Indexer jobs can potentially interfere with each other, e.g. deadlock, resulting in one of the processes completing, while the other does not.
  4. The result is that the index is now incomplete, as rows were removed by both jobs but only re-added by one of them.

In my opinion, if an indexer is in scheduled mode, then it should only be called by the appropriate cron jobs. Rather than the inventory code calling the full text indexer directly, it would be better if it could add to the change log. This could either be done:
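Independent of how the changelog integration would be done, the "respect scheduled mode" part of this suggestion can be sketched against the standard indexer framework API. The class below is hypothetical; `IndexerRegistry`, `isScheduled()` and `reindexList()` are the real framework calls:

```php
use Magento\CatalogSearch\Model\Indexer\Fulltext;
use Magento\Framework\Indexer\IndexerRegistry;

/**
 * Hypothetical guard illustrating the suggestion above: only run the
 * fulltext indexer synchronously when it is in "Update on Save" mode.
 */
class FulltextReindexGuard
{
    /**
     * @var IndexerRegistry
     */
    private $indexerRegistry;

    public function __construct(IndexerRegistry $indexerRegistry)
    {
        $this->indexerRegistry = $indexerRegistry;
    }

    /**
     * @param int[] $productIds
     */
    public function reindex(array $productIds): void
    {
        $indexer = $this->indexerRegistry->get(Fulltext::INDEXER_ID);
        if (!$indexer->isScheduled()) {
            // "Update on Save": a direct partial reindex is safe here.
            $indexer->reindexList($productIds);
        }
        // "Update by Schedule": leave the work to indexer_update_all_views,
        // so two jobs never delete and re-insert the same rows concurrently;
        // the affected ids would instead have to reach the changelog, as
        // suggested above.
    }
}
```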

sdouma commented 1 year ago

This still happens