magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.54k stars 9.32k forks source link

Price indexer slow for configurable BaseStockStatusSelectProcessor #33763

Closed webtekindo closed 2 years ago

webtekindo commented 3 years ago

Preconditions (*)

  1. Magento 2.4.2
  2. Catalog with lot of configurable products

Steps to reproduce (*)

  1. Reindex catalog_product_price

Expected result (*)

  1. Fast

Actual result (*)

  1. Slow (in my case it never ends)

While monitoring the queries while reindexing, I notice these queries took long time:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE ( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648, 171655 )) GROUP BY le.entity_id, customer_group_id, website_id

Notice the where condition:

( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648, 171655 ))

This is very slow, it returns (in my case 70.000+ products) the same products over and over for all the configurable product that this query is executed for.

It should be:

(( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 )) AND ( le.entity_id IN ( 171648, 171655 ))

With this solution the result are only about the mention configurable product in entity_id.

Looks like the issue is in BaseStockStatusSelectProcessor

We have 300.000+ products and around 30.000+ configurable products. With this fix we can reindex in 9 minutes, without it we never manage to reindex until success.


Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.

m2-assistant[bot] commented 3 years ago

Hi @webtekindo. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, please, review the Magento Contributor Assistant documentation.

Please, add a comment to assign the issue: @magento I am working on this


:clock10: You can find the schedule on the Magento Community Calendar page.

:telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

:movie_camera: You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

m2-assistant[bot] commented 3 years ago

Hi @engcom-Lima. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

engcom-Lima commented 3 years ago

Hi @webtekindo,

Since the number of configurable products on 2.4 develop instance with sample data is not as much so wanted to check with you if there is any other way we can reproduce this on Magento 2.4-develop instance ?

Here is what we tried as of now:

  1. We reindexed catalog_product_price with sample data and reindexing worked fine.

So please check on 2.4 develop instance if you are getting the same issue with your data ?

webtekindo commented 3 years ago

@engcom-Lima if you run the index this will generate queries like this:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE ( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648, 171655 )) GROUP BY le.entity_id, customer_group_id, website_id

Where entity_id is from a configurable product. Just replace the entity_id by one configurable product from sample data, and execute the query , you will notice the result is incorrect. It will return lot of simple product not related to the configurable choosen.

See my result: https://prnt.sc/1qghqbu (these are not related to my configurable product)

It should be:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE (( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 )) AND ( le.entity_id IN ( 171648, 171655 )) GROUP BY le.entity_id, customer_group_id, website_id

See my result: https://prnt.sc/1qghqz1 (these are correctly related to my configurable product)

The AND operator have higher precedence than the OR in MySQL.

This will improve indexing by a lot.

Hope this is clear.

engcom-Lima commented 3 years ago

Hi @webtekindo,

Is your issue anywhere related to #22516 ? Because it seems like duplicate and it was fixed. Please check.

Also please advise how much time your server is taking and also about your server configurations.

Thanks

webtekindo commented 3 years ago

@engcom-Lima it's not related, we don't use MSI.

You can try edit:

https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/ConfigurableProduct/Model/ResourceModel/Product/Indexer/Price/BaseStockStatusSelectProcessor.php#L66

And "echo" the query

Reindex price you will see something like this:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE ( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648 )) GROUP BY le.entity_id, customer_group_id, website_id

This query is not correct as I mentionned above. Just try that query in your database and you will quickly understand why it's not correct.

( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648 )

in MySQL means

( si.is_in_stock = 1 ) OR ( ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648 ) )

and it should be:

(( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 )) AND ( le.entity_id IN ( 171648 )

engcom-Lima commented 3 years ago

Hi @webtekindo,

The issue which I referred also didn't have MSI installed. And I tried reindexing the price by printing the query which you referred in last comment but not able to get the same query as you have provided.

Anyways, in order to do some further analysis please provide the below details also:

  1. Your server configurations.
  2. How much time your server spent on reindexing 30000+ configurable products ?
webtekindo commented 3 years ago

Hi @engcom-Lima,

Very strange you don't get the same query as it's literally what is standing here:

https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/ConfigurableProduct/Model/ResourceModel/Product/Indexer/Price/BaseStockStatusSelectProcessor.php#L63

That function add some join to the query, and if Magento add more Join to the same query or custom, then we will have a situation where the AND operator have higher precedence than the OR in MySQL, so this should be fixed anyway even not reproducible in default Magento. But in our case I don't think we have something overiding the default logic of the indexer.

Did you enable show out of stock?

Our server configuration is 32 CPU cores and 42G of RAM, we have more than 30.000 configurable product, but note that some of our configurable product have more than 30.000 variants in it (which is not a classic situation).

Without the fix implemented it takes forever, I don't know how long because it was never ending for sure more than 4 hours. With the fix implemented it takes around 10 minutes.

engcom-Lima commented 3 years ago

Hi @webtekindo,

Thank you for your input but I won't be able to confirm this issue until I reproduce this on my system.

Please check below screenshot of the query which I am getting after I am printing the query in BaseStockStatusSelectProcessor class as you suggested. In the below screenshot you can see I am not getting AND condition as you are getting.

Screenshot 2021-09-22 at 3 15 17 PM

Please let me know if you did anything differently or I missed anything ? Also can you please confirm if you are getting this issue on 2.4-develop version also or getting on 2.4.2 only ?

Thanks

engcom-Lima commented 3 years ago

Hi @webtekindo,

We have noticed that this issue has not been updated for some time. Since we are not able to reproduce it, we are closing it. Please raise a fresh ticket or reopen this ticket if you need more assistance on this.

Regards

steven-hoffman-jomashop commented 2 years ago

Hi @engcom-Lima,

We see this issue on 2.4.3:

INSERT INTO `catalog_product_index_price_cfg_opt_temp`
SELECT `le`.`entity_id`, `i`.`customer_group_id`, `i`.`website_id`, MIN(final_price), MAX(final_price), MIN(tier_price)
FROM `catalog_product_index_price` AS `i`
INNER JOIN `catalog_product_super_link` AS `l` ON l.product_id = i.entity_id
INNER JOIN `catalog_product_entity` AS `le` ON le.row_id = l.parent_id AND (le.created_in <= '1637600400' AND le.updated_in > '1637600400')
INNER JOIN `cataloginventory_stock_item` AS `si` ON si.product_id = l.product_id
INNER JOIN `cataloginventory_stock_item` AS `si_parent` ON si_parent.product_id = l.parent_id
WHERE (si.is_in_stock = 1) OR (si_parent.is_in_stock = 0) AND (le.entity_id IN (502243, 251917, 251927, 691473, 691475, 691476, 691507, 691505))
GROUP BY `le`.`entity_id`,`customer_group_id`,`website_id` 
ON DUPLICATE KEY UPDATE `min_price` = VALUES(`min_price`), `max_price` = VALUES(`max_price`), `tier_price` = VALUES(`tier_price`)

You can see that the WHERE clause is WHERE X OR Y AND Z; as per mysql docs AND is higher precedence then OR. Thus the above WHERE clause WHERE X OR Y AND Z is really: WHERE X OR (Y AND Z) which is not correct.

You can see that \Magento\ConfigurableProduct\Model\ResourceModel\Product\Indexer\Price\Configurable::fillTemporaryOptionsTable adds a where clause on the le.entity_id.

Also, you can see that the $this->baseSelectProcessor->process($select) call is made before the where adds the le.entity_id filter. (Line 214 vs line 229). You should be able to 'reproduce' by echoing the select after line 230; though I am not 100% of the sequence needed to hit line 229. (Full reindex vs partial). (You definitely need a configurable with options)


Also, in general may I recomend that you use the dev:query-log:enable command to track all query made to the DB?

steven-hoffman-jomashop commented 2 years ago

@engcom-Lima, How can we re-open this issue?

m2-assistant[bot] commented 2 years ago

Hi @engcom-Echo. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:


engcom-Echo commented 2 years ago

Hi @steven-hoffman-jomashop,

Thank you for providing this info.

Based on it, I enabled db logging and did reindexing after enabling 'Display Out of Stock Products' from Admin Panel. I checked all queries that were made to db in logs for Magento 2.4-develop and the query that I got is below:

INSERT INTO `catalog_product_index_price_cfg_opt_temp` SELECT `le`.`entity_id`, `i`.`customer_group_id`, `i`.`website_id`, 
MIN(final_price), MAX(final_price), MIN(tier_price) FROM `catalog_product_index_price_replica` AS `i`
INNER JOIN `catalog_product_super_link` AS `l` ON l.product_id = i.entity_id
INNER JOIN `catalog_product_entity` AS `le` ON le.entity_id = l.parent_id
INNER JOIN `cataloginventory_stock_status` AS `child_stock_default` ON child_stock_default.product_id = l.product_id
INNER JOIN `cataloginventory_stock_status` AS `parent_stock_default` ON parent_stock_default.product_id = le.entity_id
WHERE (le.entity_id IN (69987,69988,.......)) AND (child_stock_default.stock_status = 1 OR parent_stock_default.stock_status = 0) GROUP BY `le`.`entity_id`,`customer_group_id`,`website_id` 
ON DUPLICATE KEY UPDATE `min_price` = VALUES(`min_price`), `max_price` = VALUES(`max_price`), `tier_price` = VALUES(`tier_price`)

Above query is made with 1000+ configurable products but I am commenting based on the query logic which is in question. So, as you can see AND OR combination is already correct as per above query to DB. For confirming this, you can also check the same thing in Magento latest version or you can also comment if I missed anything in finding the required query. Accordingly I will check if this issue is there in the older versions like 2.4.2 and 2.4.3.

Thanks

steven-hoffman-jomashop commented 2 years ago

Hi @engcom-Echo,

It seems that the codebase was recently changed in commit https://github.com/magento/magento2/commit/8fe11d7c55c4fd44fbd180e6688aab2ba39f09c0#diff-1d5f1e397520e3931003c3ac5f9ca5a547e23b67640799f9ec824eeebaeaf03f.

See: (On dec 19) https://github.com/magento/magento2/commit/e8c31f5db58fa82a75b3714e011fc95b0ab219b9

It does appear that there was an internal ticket that was being worked on; it does not appear to be visible on this platform. (GitHub issues or pull requests)

If you can confirm that this is a bug on version 2.4.3-p1 and lower that seems to be the next helpful step.

-SH

m2-assistant[bot] commented 2 years ago

Hi @engcom-Hotel. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

engcom-Hotel commented 2 years ago

Hello @steven-hoffman-jomashop @webtekindo,

We will try to reproduce the issue in Magento 2.4-develop as well as 2.4.3-p1 and get back to you on this.

Thanks

engcom-Hotel commented 2 years ago

Hello @webtekindo,

We have tried to reproduce this issue in Magento 2.4-develop branch with enable query log. But for me also the same query is not available there in the logs.

It might be possible that the issue is resolved in the current develop branch. We request you to please try to reproduce the issue in Magento 2.4-develop branch and let us know in case you are still able to reproduce the issue.

Thanks

engcom-Hotel commented 2 years ago

Dear @webtekindo,

We have noticed that this issue has not been updated for a period of 14 Days. Hence we assume that this issue is fixed now, so we are closing it. Please raise a fresh ticket or reopen this ticket if you need more assistance on this.

Regards

sebfie commented 1 year ago

@steven-hoffman-jomashop Did you make a patch or something? Hard to find how to fix it!