magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.39k stars 9.29k forks source link

2.3 Customer module Recurring setup script performance problems. #19469

Closed vbuck closed 4 years ago

vbuck commented 5 years ago

When running bin/magento setup:upgrade for a Magento CE 2.3.x installation(or just use Magento Open source), there is an unexpected delay in the recurring setup script execution on the Magento_Customer module(every time when you run bin/magento setup:upgrade) . This is more pronounced on a large data set (>500K customers).

References

Preconditions (*)

  1. Magento CE 2.2.x(or 2.3.x) -> 2.3.x upgraded codebase (pre DB upgrade)
  2. A large customer database (>500K records).

Steps to reproduce (*)

  1. After codebase upgrade, proceed to run bin/magento setup:upgrade
  2. Observe execution delay on process step:
    Module 'Magento_Customer':
    Running data recurring...

Repeat these steps and you will notice, since there is a recurring upgrade script, that it runs every time.

Expected result (*)

  1. No recurring data scripts run, or they are or more performant.

Actual result (*)

  1. Recurring data scripts run with each attempt to upgrade the DB.

After ending of update you can run again bin/magento setup:upgrade and you will meet this problem again. I am not sure of the need/reason to run a recurring upgrade, but from the reference posted at the top of this issue it's clear the intent to is to handle reindexing on upgrades. This seems unwise and gives room for abusing recurring upgrade scripts with patch-like behavior or long-running processes which can delay deployment times.

Do you have any background regarding the nature of the change?

magento-engcom-team commented 5 years ago

Hi @vbuck. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento-engcom-team give me $VERSION instance

where $VERSION is version tags (starting from 2.2.0+) or develop branches (for example: 2.3-develop). For more details, please, review the Magento Contributor Assistant documentation.

@vbuck do you confirm that you was able to reproduce the issue on vanilla Magento instance following steps to reproduce?

magento-engcom-team commented 5 years ago

Hi @engcom-backlog-nazar. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

ghost commented 5 years ago

Hi @vbuck , thank you for your report. Please follow these guidelines for proper tracking of your issue. You can report Commerce-related issues in one of two ways: You can use the Support portal associated with your account or If you are a Partner reporting on behalf of a merchant, use the Partner portal.

GitHub is intended for Magento Open Source users to report on issues related to Open Source only. There are no account management services associated with GitHub.

vbuck commented 5 years ago

@engcom-backlog-nazar Understood. Admittedly this issue may be a bit misdirected. However, my reason for starting a discussion in the Open Source forums was two-fold:

That said, you may keep this issue closed and I will forward to the partner portal.

ishakhsuvarov commented 5 years ago

Reopening, to verify with vanilla CE instance and 500k customer accounts.

magento-engcom-team commented 5 years ago

@engcom-backlog-nazar Thank you for verifying the issue. Based on the provided information internal tickets MAGETWO-96971 were created

sdzhepa commented 5 years ago

Hello @vbuck

I see that all internal tickets related to this issue were closed. And I suppose that issue has been resolved also.

Please, feel free to reopen or create a new one if issue still exists or was not fully fixed

Thank you for feedback and collaboration

dambrogia commented 5 years ago

@sdzhepa Is there any status update to what happened this ticket? You state you "suppose the issue has been resolved" but I don't see anything related to that within this thread of comments/replies. Also the tags of Reproduced on 2.3.x and Fixed in 2.2.x are quite conflicting (added on Dec 5). Tagging the issue as Ready for Work on Dec 5 and then removing the assignment on Dec 5 without any reference as to what changed is also quite confusing if this "has been resolved".

I'm using 2.3 EE and I'm seeing >1hr update times because Magento is reindexing the customer_grid index on every bin/magento setup:upgrade statement (occurs in the Magento_Customer module).

Is there a reason this needs to happen? It seems like this should not happen during the update.

The purpose of the update script is to install/update/modify schema between version. The purpose of the indexers is to enhance lookups.

The actions seem exclusive and separate from each other. Can anyone elaborate to why this reindex is needed during the update? And if it is in fact needed, Can anyone elaborate on what we can do to enhance the performance of it?

dambrogia commented 5 years ago

@sdzhepa @ishakhsuvarov @magento-engcom-team

Any updates or info on this? Do I need to create a new ticket for this?

allamsettiramesh commented 5 years ago

I did't find any solution regarding this issue.

ghost commented 5 years ago

Hi @dambrogia Hi @allamsettiramesh i'm reopen this as this was not fixed. selection_287

dambrogia commented 5 years ago

Hi @engcom-backlog-nazar thank you for re-opening this issue.

If it's not imperative that the reindex needs to happen on every setup:upgrade command, can we remove it? I also think it would be helpful to know why/when it is appropriate to reindex the users and what the thought process was behind reindexing them on every setup:upgrade.

I would be glad to help out creating a PR for removing the recurring data script if necessary.

andkirby commented 5 years ago

@dambrogia, thank you for this note. I got 3m for magento setup:upgrade just for removing a module! M2 so fast! =)

magento-engcom-team commented 5 years ago

Hi @vbuck, @slackerzz.

Thank you for your report and collaboration!

The related internal Jira ticket MAGETWO-96971 was closed as non-reproducible in 2.3.x. It means that Magento team either unable to reproduce this issue using provided Steps to Reproduce from the Description section on clean or the issue has been already fixed in the scope of other tasks.

But if you still run into this problem please update or provide additional information/steps/preconditions in the Description section and reopen this issue.

slackerzz commented 5 years ago

@magento-engcom-team so you are asserting that https://github.com/magento/magento2/blob/2.3-develop/app/code/Magento/Customer/Setup/RecurringData.php is not run on every setup:upgrade? This is a big problem for a real store, probably not noticeable in a dev box with sample data.

onepack commented 5 years ago

Data recurring is now taking 15 minutes after I did an install via composer of a new module on dev. It has always been slow but it even seems to hang now. Developing stuff is taking sooo long this way. Getting tired from waiting for the update process to finish after you deployed something tp dev and later prod. Please fix.

jeffekg commented 4 years ago

Why does Magento continue to develop new features, when their existing codebase has so many issues? This is a huge issue and was submitted back in November of 2018, at least provide an official patch please?

m2-assistant[bot] commented 4 years ago

Hi @engcom-Charlie. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

magento-engcom-team commented 4 years ago

:white_check_mark: Confirmed by @engcom-Charlie Thank you for verifying the issue. Based on the provided information internal tickets MC-18593 were created

Issue Available: @engcom-Charlie, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

jeffekg commented 4 years ago

In the meantime, just curious if simply patching this to remove the 2 lines of code firing the indexer would cause issues?

Something like:

diff --git a/vendor/magento/module-customer/Setup/RecurringData.php b/vendor/magento/module-customer/Setup/RecurringData.php
index fbef4c0..153d08f 100644
--- a/vendor/magento/module-customer/Setup/RecurringData.php
+++ b/vendor/magento/module-customer/Setup/RecurringData.php
@@ -37,7 +37,5 @@ class RecurringData implements InstallDataInterface
      */
     public function install(ModuleDataSetupInterface $setup, ModuleContextInterface $context)
     {
-        $indexer = $this->indexerRegistry->get(Customer::CUSTOMER_GRID_INDEXER_ID);
-        $indexer->reindexAll();
     }
 }
vbuck commented 4 years ago

Trying to be productive here, but surely our frustration can only be contained for so long:

And all throughout this madness, the community is reporting no visible fix, despite having pointed to the exact place of error. @jeffekg just showed that the issue persists, and I can confirm that right up to the Open Source 2.3.2 tag it is present:

https://github.com/magento/magento2/blob/2.3.2/app/code/Magento/Customer/Setup/RecurringData.php

That fact that it was acknowledged internally as reproducible and then closed as not reproducible can only lead me to believe that there's a communications problem internally with your team(s). I get it, it happens. But I also know that sometimes it takes somebody with the power and resolve to own a single problem in order to take it to completion.

So can you please respond to this issue by considering the impact of the RecurringData script on the customer module on large data sets? Nobody seems to understand the nature or reason for this change, which is what I asked for when I opened the issue. If you can't justify it, and it causes pain for large merchants, please remove it. On principle we should not be reindexing in a recurring update. If you're doing it, you probably did something else wrong.

hostep commented 4 years ago

Guys, there is a PR over here for this issue: https://github.com/magento/magento2/pull/21235, if you scroll up a bit to 15 February, you'll see it being linked (we just need somebody to finally approve it).

vbuck commented 4 years ago

Thanks @hostep. It was my mistake for not catching that reference and it seems my heat is undue. I can see in #21235 there has been a lot of activity. However it looks like @orlangur still has some problem with that proposed solution, based on the last resolved review comment. I will continue to monitor that thread for a solution, and I'm marking this one as closed.

hostep commented 4 years ago

Please leave it open until #21235 is merged :)

(this bot removing all these labels is kind of annoying to be honest)

vbuck commented 4 years ago

No problem. I tried to lighten your load in tracking everything :)

magento-engcom-team commented 4 years ago

:white_check_mark: Confirmed by @sdzhepa Thank you for verifying the issue. Based on the provided information internal tickets MC-18593 were created

Issue Available: @sdzhepa, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

slackerzz commented 4 years ago

For those waiting for a fix i suggest to use a composer patch that removes the vendor/magento/module-customer/Setup/RecurringData.php, here's mine:

From 7e3448e72335f31cb8fd52a1fedee23b265075bb Mon Sep 17 00:00:00 2001
From: Lorenzo Stramaccia <lorenzo.stramaccia@magespecialist.it>
Date: Thu, 14 Feb 2019 17:56:34 +0100
Subject: [PATCH] Remove Magento/Customer/Setup/RecurringData.php

---
 .../Magento/Customer/Setup/RecurringData.php  | 43 -------------------
 1 file changed, 43 deletions(-)
 delete mode 100644 vendor/magento/module-customer/Setup/RecurringData.php

diff --git a/vendor/magento/module-customer/Setup/RecurringData.php b/vendor/magento/module-customer/Setup/RecurringData.php
deleted file mode 100644
index fbef4c05d126..000000000000
--- a/vendor/magento/module-customer/Setup/RecurringData.php
+++ /dev/null
@@ -1,43 +0,0 @@
-<?php
-/**
- * Copyright © Magento, Inc. All rights reserved.
- * See COPYING.txt for license details.
- */
-
-namespace Magento\Customer\Setup;
-
-use Magento\Framework\Indexer\IndexerRegistry;
-use Magento\Framework\Setup\InstallDataInterface;
-use Magento\Framework\Setup\ModuleContextInterface;
-use Magento\Framework\Setup\ModuleDataSetupInterface;
-use Magento\Customer\Model\Customer;
-
-/**
- * Upgrade registered themes.
- */
-class RecurringData implements InstallDataInterface
-{
-    /**
-     * @var IndexerRegistry
-     */
-    private $indexerRegistry;
-
-    /**
-     * Init
-     *
-     * @param IndexerRegistry $indexerRegistry
-     */
-    public function __construct(IndexerRegistry $indexerRegistry)
-    {
-        $this->indexerRegistry = $indexerRegistry;
-    }
-
-    /**
-     * {@inheritdoc}
-     */
-    public function install(ModuleDataSetupInterface $setup, ModuleContextInterface $context)
-    {
-        $indexer = $this->indexerRegistry->get(Customer::CUSTOMER_GRID_INDEXER_ID);
-        $indexer->reindexAll();
-    }
-}

Regarding the pull request i don't know what to say, we have to wait. I dont know if @orlangur is away, on holiday or something else, however i opened it the 14th of February.

vbuck commented 4 years ago

Thanks @slackerzz. Your patch is exactly what I have been applying since I first opened this ticket last year. It has not yet created any problems for me. I'll continue to wait on Magento until the core issue is resolved.

m2-assistant[bot] commented 4 years ago

Hi @Nazar65. Thank you for working on this issue. Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please, go though the following instruction:


slackerzz commented 4 years ago

Let's celebrate the first birthday of this issue :birthday: :tada: :tada:

ihor-sviziev commented 4 years ago

@slackerzz we have a progress! Internal team started working on this issue! https://github.com/magento/magento2/pull/26163#issuecomment-583422293

m2-assistant[bot] commented 4 years ago

Hi @o-iegorov. Thank you for working on this issue. Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please, go though the following instruction:


magento-engcom-team commented 4 years ago

Hi @vbuck, @Nazar65, @o-iegorov.

Thank you for your report and collaboration!

The issue was fixed by Magento team. The fix was delivered into magento/magento2:2.4-develop branch(es). Related commit(s):

The fix will be available with the upcoming 2.4.0 release.

slackerzz commented 4 years ago

The issue was about a reindex in a recurring script and the solution has a reindex in a recurring script.

ihor-sviziev commented 4 years ago

@slackerzz i didn’t tested it, but seems like in most cases it will not do reindex, so performance issue is resolved. Don’t you think so?

slackerzz commented 4 years ago

If the customer grid index is invalid it will perform a reindex during setup:upgrade and the store will be in maintenance for minutes during deploy. If this is the Magento solution i will update my patch to remove the new RecurringData script.

ihor-sviziev commented 4 years ago

Did you rested this new solution? Does customer grid index always invalid? I believe it should be invalid only in case if some attribute was added

On Mon, 9 Mar 2020 at 13:04, Lorenzo Stramaccia notifications@github.com wrote:

If the customer grid index is invalid it will perform a reindex during setup:upgrade and the store will be in maintenance for minutes during deploy. If this is the Magento solution i will update my patch to remove the new RecurringData script.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/magento/magento2/issues/19469?email_source=notifications&email_token=AAOJOUKNGC5ABH4YZSDHJFDRGTENTA5CNFSM4GHJ3ES2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEOGUVWI#issuecomment-596462297, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOJOUOZD5A3LIN67IAZI4DRGTENTANCNFSM4GHJ3ESQ .

o-iegorov commented 4 years ago

@slackerzz Natural state of customer grid index is valid. Reindex will be performed only in case if it is invalid that seems reasonable. If you have enableb cron (that also natural for prod environments) reindex wil be performed in bacground for invalid index, so case when you will perform setup:upgrade with invalid index is very rare and for this case reindex will be performed (that is ok for that case)

davidalger commented 4 years ago

If you have enableb cron (that also natural for prod environments) reindex wil be performed in bacground for invalid index, so case when you will perform setup:upgrade with invalid index is very rare and for this case reindex will be performed (that is ok for that case)

@o-iegorov I'm nearly certain this portion of your recent statement is factually incorrect. The logic in this recurring data upgrade (the "fixed" version in 2.4) doesn't account for schedule vs realtime separately, it simply calls reindexAll when it believes the indexer should be run: https://github.com/magento/magento2/commit/0fd8a5146cdf4e524150e68f89085d90f0d42be3#diff-0ac6816ed3ec11b7a9c59731fae99d4bR43-R44

Calling reindexAll in the above will simply result in the entire index being rebuilt regardless of the indexer mode. So the penalty still exists, and it would not (although I have not tested it) result in an asynchronous execution of the indexer. Unless I'm missing some other fundamental change in 2.4 codebase regarding indexers.

hostep commented 4 years ago

I agree with @davidalger, an extra condition in the if should be added to prevent a synchronous reindex action when the indexer is set to schedule mode. Because it's not necessary and it will be picked up asynchronously anyways.

But this is a mini optimisation, the chances of having an invalid customer grid indexer while running bin/magento setup:upgrade aren't that big probably.

ihor-sviziev commented 4 years ago

@davidalger @hostep I do agree with you that it's not perfect solution, it could be improved, but it significantly improves situation when no customer attributes were affected. Feel free to create Pull Request with improvement based on your suggestions.

o-iegorov commented 4 years ago

@davidalger I din't say nothing about indexer mode. Moreover, customer grid indexer doesn't support update by schedule - https://docs.magento.com/m2/ce/user_guide/system/index-management.html It's recall reindex when state is invalid, that's correct behavior. When indexer is in invalid sate magento cron job will perform full reindex indexer with any kind of update mode.

o-iegorov commented 4 years ago

@hostep please refer my prev comment - customer grid indexer doesn't support update by schedule

davidalger commented 4 years ago

@o-iegorov Where do you see customer grid index only supports index on save? I have this index running in production today in schedule mode. This is an index using the materialized view patterns in M2, and it would make little sense for it to only support On Save.

When indexer is in invalid sate magento cron job will perform full reindex.

This may be correct, but what I'm saying is that what the upgrade does is call reindexAll which does not mark the index as invalid, it runs the index synchronously. So IF the upgrade runs while the indexer is in an invalid state, or should the indexer be marked invalid by say an upgrade routine that's adding an attribute to the customer grid, the index will still run synchronously during the upgrade rather than simply letting the cron run the reindex to cleanup the invalid state.

@ihor-sviziev I don't have time at the moment to create a PR to further enhance this on 2.4 (unfortunately). On the 2.3 project that highlighted this for me with a 40 minute grid reindex, I'm simply deleting the Recurring Data script from the customer module as a workaround. I'm mainly posting for the sake of others as I read the original comment to infer something regarding the asynchronicity of the indexer as it relates to the setup upgrade routine.

o-iegorov commented 4 years ago

@davidalger Please read carefully provided link: image

It's works just because it's invalidated and reindexed in background by cron job. There are no mview processor for this indexer just dummy.

Upgrade call reindexAll in case when indexer is invalid, that a rare case for production. In case when invalidation was performed by some setup script (during adding some attribute for example) reindex is should be performed. But not every setup upgrade. Reindex itself in this indexer is very tricky - for example it's creates related database tables and cannot be replaced with just invalidation. But if you know cases when current solution may be improved - please create related PR.

This fix also delivered to 2.3-develop and will be part of the 2.3.6 release

davidalger commented 4 years ago

@o-iegorov Very interesting. I missed that note on the page. Thanks for the followup. Also g2k regarding 2.3.6 release. 👍

dan-ding commented 4 years ago

2.3.6?! October?

MichaelThessel commented 4 years ago

Too bad that it takes 6 months for a confirmed and fixed issue to be released. Not to speak of the 17 months it took to actually fix it. In case you use composer this is a quick way to patch your install.

cd vendor/magento/module-customer
curl -S https://github.com/magento/magento2/commit/0fd8a5146cdf4e524150e68f89085d90f0d42be3.diff | patch -p5 
curl -S https://github.com/magento/magento2/commit/436d0ae410101e526ac9326483788153de507f26.diff | patch -p5 
vbuck commented 4 years ago

@MichaelThessel your solution makes it appear as though you are committing the vendor directory on your VCS, or else applying these as part of a deployment pipeline step.

I do agree that the lack of prioritization on this problem is a shame. If we didn't have a way to publish diffs on GitHub publicly maybe that would have forced an earlier release of the fix.

Anyway, for those who don't commit vendor to VCS (like me), I would suggest converting Michael's steps to fit your specifications; ie:

Alternative Patch Method

  1. Fetch diffs as per commits described here: https://github.com/magento/magento2/issues/19469#issuecomment-596142891
  2. Commit as patch files on your VCS
  3. If using Composer, follow cweagans method as per: https://devdocs.magento.com/guides/v2.3/comp-mgr/patching.html
  4. If using Magento Cloud, place patch files into m2-hotfixes, as per: https://devdocs.magento.com/cloud/project/project-patch.html
MichaelThessel commented 4 years ago

@vbuck Thanks for pointing this out. I wasn't aware of the possibility to patch with composer. I went down the route you suggested and it works great. In case someone wants to implement this and has their Magento core modules in vendor here is the patch with the paths corrected:

https://gist.github.com/MichaelThessel/0b0cf69dd20326491115413adf7a94b9