magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.38k stars 9.28k forks source link

Root category url_key causes full table scans #30519

Open george-vlahakis opened 6 years ago

george-vlahakis commented 6 years ago

Having a database with approx 4K categories (anchors) and 16K products spread throughout the tree, making a url_key change on a root category causes significant delays (in hours) to update children products and re-generate URL rewrites.

In specific:

When changing the URL key of a root category (say level 1) it will trigger a cascading regeneration of all url rewrites for all products within. This is expected behavior. However, the process increasingly slows down as the size of catalog_url_rewrite_product_category increases. I have seen this process take over 5 hours!

To replicate, and test, I moved to using Oleg's extension (here: https://github.com/olegkoval/magento2-regenerate_url_rewrites) which in essence does the same thing.

Regenerating url rewrites took over 8 hours for two stores.

Further investigation with slow_query enabled showed that DELETE sql calls on https://github.com/olegkoval/magento2-regenerate_url_rewrites before inserts are causing full table scans. Example:

DELETE FROM `catalog_url_rewrite_product_category` 
WHERE
    (url_rewrite_id IN ((SELECT 
        `url_rewrite`.`url_rewrite_id`
    FROM
        `url_rewrite`
    WHERE
        (`entity_id` IN ('13614' , '13615', '13617'))
        AND (`store_id` IN (2))
        AND (`entity_type` IN ('product')))));

The above took over 2 seconds to execute and causes a full table scan on catalog_url_rewrite_product_category:

image

I tried adding a key on url_rewrite_id (there is none) but had no speed changes and a full table scan is still preferred by MySQL.

Assuming, Magento 2.x correctly produces the multiple url rewrites (creating a huge url_rewrites table) then this behavior renders Magento completely unusable for large datasets.

Preconditions

  1. Magento 2.2
  2. MySQL 5.6.35
  3. MAC OSX (behavior is not relevant to OS)

Steps to reproduce

  1. Have a large dataset (above 12K products and 3K categories all anchors)
  2. Try changing the URL key on a root category

Expected result

  1. Better performance
  2. Usage of indexes
  3. Progress report

Actual result

  1. Considerable delays, timeouts and eventual server death
magento-engcom-team commented 6 years ago

@georgios-2317, thank you for your report. We've created internal ticket(s) MAGETWO-83973 to track progress on the issue.

george-vlahakis commented 5 years ago

Further investigation in this issue proved that the offending code is in Product::removeMultipleByProductCategory. If we switch to a table JOIN instead of using IN for the ids the SQL runs with index usage and performs as expected. However it seems that Zend doesn't support this and would break compatibility. Instead I went ahead and loaded the matching url_rewrite ids into memory (can not think of a situation where this would be too much to handle). One could switch to a tmp table if needed I suppose.

Anyhow, this is where I am at now (and back at full speed - no more full table scans): Magento\CatalogUrlRewrite\Model\ResourceModel\Category\Product.php:79

public function removeMultipleByProductCategory(array $filter)
    {
        $idsToDelete = $this->prepareSelect($filter)->query()->fetchAll(\Zend_Db::FETCH_COLUMN, 0);

        if (sizeof($idsToDelete) > 0) {

            $idsToDelete = implode(",", $idsToDelete);

            return $this->getConnection()->delete(
                $this->getTable(self::TABLE_NAME),
                ['url_rewrite_id in (' . $idsToDelete . ')']
            );
        }

        return 0;

    }
m2-assistant[bot] commented 3 years ago

Hi @george-vlahakis. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, please, review the Magento Contributor Assistant documentation.

Please, add a comment to assign the issue: @magento I am working on this


:clock10: You can find the schedule on the Magento Community Calendar page.

:telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

:movie_camera: You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

magento-engcom-team commented 3 years ago

:white_check_mark: Confirmed by @sivaschenko Thank you for verifying the issue. Based on the provided information internal tickets MC-38500 were created

Issue Available: @sivaschenko, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.