magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.55k stars 9.32k forks source link

OrderRepositoryInterface memory leak when used in foreach loops #34851

Open ioweb-gr opened 2 years ago

ioweb-gr commented 2 years ago

Preconditions (*)

  1. 2.4.1 - 2.4.3-p1
  2. A very large collection of orders / shipments

Steps to reproduce (*)

I'm trying to process all orders related to the stored shipments to do some data clean up when I stumbled into this memory leak. Basically on each repository->get() action the memory increases and is never released.

Check the following sample code using a single OrderRepositoryInterface which simply iterates a large collection of shipments found by the repository's getList function and tries to retrieve the order by id in a CLI command.

As you can see there's no reference inside the code that is keeping the object from being collected and the memory should presumably be released on every gc cycle for all the objects inside the foreach loop.

```php _state = $state; $this->searchCriteriaBuilderFactory = $searchCriteriaBuilderFactory; $this->invoiceRepository = $invoiceRepository; $this->creditmemoRepository = $creditmemoRepository; $this->shipmentRepository = $shipmentRepository; $this->orderRepository = $orderRepository; $this->orderRepositoryFactory = $orderRepositoryFactory; $this->orderFactory = $orderFactory; parent::__construct(); } /** * Configure CLI command */ protected function configure() { $this->setName('ioweb:documents:delete-unlinked'); $this->setDescription('Deletes shipments/invoices/credit memos that do not have an order id linked in the database'); parent::configure(); } /** * @param InputInterface $input * @param OutputInterface $output */ protected function execute(InputInterface $input, OutputInterface $output) { $this->_state->setAreaCode('adminhtml'); $pageSize = 50; $output->writeln("Fetching shipments with repo interface"); /** @var SearchCriteriaBuilder $shipmentBuilder */ $shipmentBuilder = $this->searchCriteriaBuilderFactory->create(); $shipmentsList = $this->shipmentRepository->getList($shipmentBuilder->create()); $shipments = $shipmentsList->getItems(); foreach ($shipments as $index => $shipment) { $orderId = $shipment->getOrderId(); $order = $this->orderRepository->get($orderId); unset($order, $shipment); if ($index % 1000 === 0) { $output->writeln("collecting cycles"); $output->writeln($this->convert(memory_get_usage())); gc_collect_cycles(); $output->writeln($this->convert(memory_get_usage())); } } $output->writeln("End deletion"); } private function convert($size) { $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb'); return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i]; } private function getOrder($orderId) { $repo = $this->orderRepositoryFactory->create(); return $repo->get($orderId); } } ```

On the contrary the memory is never released as the output says every 1000 items processed. I end up with this result

Fetching shipments with repo factory
collecting cycles
1.28 gb
1.28 gb
collecting cycles
1.67 gb
1.67 gb
collecting cycles
2.08 gb
2.08 gb
collecting cycles
2.48 gb
2.48 gb
collecting cycles
2.87 gb
2.87 gb
collecting cycles
3.28 gb
3.28 gb
collecting cycles
3.68 gb
3.68 gb
PHP Fatal error:  Allowed memory size of 4294967296 bytes exhausted (tried to allocate 28672 bytes) in /httpdocs/vendor/magento/framework/DB/Statement/Pdo/Mysql.php on line 91

This lead me to think that the repository itself is keeping a weak reference or a hidden reference to the object thus it cannot be released even if I unset it on my code.

Thus I changed the way of the code to use a repository factory instead

```php _state = $state; $this->searchCriteriaBuilderFactory = $searchCriteriaBuilderFactory; $this->invoiceRepository = $invoiceRepository; $this->creditmemoRepository = $creditmemoRepository; $this->shipmentRepository = $shipmentRepository; $this->orderRepository = $orderRepository; $this->orderRepositoryFactory = $orderRepositoryFactory; $this->orderFactory = $orderFactory; parent::__construct(); } /** * Configure CLI command */ protected function configure() { $this->setName('ioweb:documents:delete-unlinked'); $this->setDescription('Deletes shipments/invoices/credit memos that do not have an order id linked in the database'); parent::configure(); } /** * @param InputInterface $input * @param OutputInterface $output */ protected function execute(InputInterface $input, OutputInterface $output) { $this->_state->setAreaCode('adminhtml'); $output->writeln("Start deleting invoices/shipments and credit memos"); $output->writeln("Fetching shipments with repo factory"); /** @var SearchCriteriaBuilder $shipmentBuilder */ $shipmentBuilder = $this->searchCriteriaBuilderFactory->create(); $shipmentsList = $this->shipmentRepository->getList($shipmentBuilder->create()); $shipments = $shipmentsList->getItems(); foreach ($shipments as $index => $shipment) { $orderId = $shipment->getOrderId(); $order = $this->getOrder($orderId); unset($order, $shipment); if ($index % 1000 === 0) { $output->writeln("collecting cycles"); $output->writeln($this->convert(memory_get_usage())); gc_collect_cycles(); $output->writeln($this->convert(memory_get_usage())); } } $output->writeln("End deletion"); } private function convert($size) { $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb'); return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i]; } private function getOrder($orderId) { $repo = $this->orderRepositoryFactory->create(); return $repo->get($orderId); } } ```

Interestingly enough I saw that memory was decreasing on each gc cycle.

Fetching shipments with repo factory
collecting cycles
1.25 gb
1.18 gb
collecting cycles
1.57 gb
1.35 gb
collecting cycles
1.75 gb
1.52 gb
collecting cycles
1.93 gb
1.69 gb
collecting cycles
2.09 gb
1.86 gb
collecting cycles
2.26 gb
2.03 gb
collecting cycles
2.44 gb
2.21 gb
collecting cycles
2.61 gb
2.38 gb
collecting cycles
2.8 gb
2.56 gb
collecting cycles
2.96 gb
2.73 gb
collecting cycles
3.13 gb
2.9 gb
collecting cycles
3.31 gb
3.08 gb
collecting cycles
3.48 gb
3.25 gb
collecting cycles
3.66 gb
3.43 gb
PHP Fatal error:  Allowed memory size of 4294967296 bytes exhausted (tried to allocate 28672 bytes) in /httpdocs/vendor/magento/framework/DB/Statement/Pdo/Mysql.php on line 91

However it seems that it's not entirely released. The footprint is still pretty big but even so using the factory and creating a new instance each time leads to better memory management than using a single repository instance. This doesn't seem right.

Expected result (*)

  1. Memory is released when references are destroyed in our code.
  2. I can use foreach normally and let gc do it's work when using repositories in a foreach statement.

Actual result (*)

  1. The repositories and factories are keeping references we don't have access to preventing the unused objects to be collected in the next gc cycle.
  2. Repositories are unusable in foreach statements for large SearchResultsInterface sets

It seems the only way to bypass this is to process each item in external scripts so that when they end up execution they will forcefully release all memory. This is not a solution thogh.


Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.

m2-assistant[bot] commented 2 years ago

Hi @ioweb-gr. Thank you for your report. To speed up processing of this issue, make sure that you provided the following information:

Make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, review the Magento Contributor Assistant documentation.

Add a comment to assign the issue: @magento I am working on this

To learn more about issue processing workflow, refer to the Code Contributions.


:clock10: You can find the schedule on the Magento Community Calendar page.

:telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

:movie_camera: You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

ioweb-gr commented 2 years ago

Some interesting facts by memprof

image

It seems the plugins in GiftMessage are responsible for a huge increase in memory consumption although it doesn't explain the object not being able to release memory just for using lots of memory per call.

image

Does the gift message introduce any circular reference with the order?

m2-assistant[bot] commented 2 years ago

Hi @engcom-Hotel. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

LucScu commented 2 years ago

@ioweb-gr what do you think is the best way to avoid this issue? Using model factory or collection factory?

ioweb-gr commented 2 years ago

I haven't found a way to avoid the issue unfortunately. I can't track where the object that can't be released from memory is instantiated and stored

pczerkas commented 2 years ago

@ioweb-gr these two functions could be a culprit: https://github.com/magento/magento2/blob/302f6c52138221d511b0e1742f94cd43f68aa6f3/lib/internal/Magento/Framework/Model/EntitySnapshot.php#L54 https://github.com/magento/magento2/blob/302f6c52138221d511b0e1742f94cd43f68aa6f3/lib/internal/Magento/Framework/Model/ResourceModel/Db/VersionControl/Snapshot.php#L43

Try with commented out bodies of these two functions.

ioweb-gr commented 2 years ago

Much smaller footprint but the issue remains with those functions commented out.

Fetching shipments with repo interface
collecting cycles
837.19 mb
837.19 mb
collecting cycles
1.15 gb
1.15 gb
collecting cycles
1.48 gb
1.48 gb
collecting cycles
1.82 gb
1.82 gb
collecting cycles
2.15 gb
2.15 gb
collecting cycles
2.48 gb
2.48 gb
collecting cycles
2.82 gb
2.82 gb
collecting cycles
3.16 gb
3.16 gb
collecting cycles
3.5 gb
3.5 gb
collecting cycles
3.84 gb
3.84 gb
PHP Fatal error:  Allowed memory size of 4294967296 bytes exhausted (tried to allocate 28672 bytes) in /httpdocs/vendor/magento/framework/DB/Statement/Pdo/Mysql.php on line 91
pczerkas commented 2 years ago

Out of curiosity, could you check how it behaves with https://www.php.net/manual/en/info.configuration.php#ini.zend.enable-gc disabled?

ioweb-gr commented 2 years ago

Sure, made no difference

Fetching shipments with repo interface
collecting cycles
837.19 mb
837.19 mb
collecting cycles
1.15 gb
1.15 gb
collecting cycles
1.48 gb
1.48 gb
collecting cycles
1.82 gb
1.82 gb
collecting cycles
2.15 gb
2.15 gb
collecting cycles
2.48 gb
2.48 gb
collecting cycles
2.82 gb
2.82 gb
collecting cycles
3.16 gb
3.16 gb
collecting cycles
3.5 gb
3.5 gb
collecting cycles
3.84 gb
3.84 gb
PHP Fatal error:  Allowed memory size of 4294967296 bytes exhausted (tried to allocate 28672 bytes) in /httpdocs/vendor/magento/framework/DB/Statement/Pdo/Mysql.php on line 91
stefantoczek commented 2 years ago

@ioweb-gr these two functions could be a culprit: https://github.com/magento/magento2/blob/302f6c52138221d511b0e1742f94cd43f68aa6f3/lib/internal/Magento/Framework/Model/ResourceModel/Db/VersionControl/Snapshot.php#L43

Try with commented out bodies of these two functions.

recently I came across similar issue and found out that the 2nd class you mentioned is causing it and adding return; as first line in methods body fixes it as basically this method saves every loaded entity into lookup table which without force emptying it won't empty at all until request / command ends; I haven't found any elegant way for running code iterating over huge collections except for dirty plugin way:

class SnapshotPlugin
{
    public function aroundRegisterSnapshot(
        \Magento\Framework\Model\ResourceModel\Db\VersionControl\Snapshot $subject,
        callable $proceed,
        \Magento\Framework\DataObject $entity
    ) {
        if(!$this->config->disableSnapshot){
            return $proceed($entity);
        }
    }
}

basically, when you need to run huge export command then set flag inside config class & you're done ¯_(ツ)_/¯

ioweb-gr commented 2 years ago

Well in my case I tried it with those two function bodies removed and the issue persisted. Not sure why.

stefantoczek commented 2 years ago

Are you sure the issue isn't caused by 3rd party module? I can see amasty in profiler output, have you checked with those modules removed?

W dniu wt., 18.01.2022 o 17:08 IOWEB TECHNOLOGIES @.***> napisał(a):

Well in my case I tried it with those two function bodies removed and the issue persisted. Not sure why.

— Reply to this email directly, view it on GitHub https://github.com/magento/magento2/issues/34851#issuecomment-1015563948, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJMCF2IPJ6Z4BLK6CQQAAETUWWGAVANCNFSM5KNWW2IA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you commented.Message ID: @.***>

julien-gillot commented 2 years ago

I think there is a problem at this level, I see the same thing on our setup (magento 2.4.3-p1, 140k+ orders) So I hope with my comment to bring my stone to the building

Example with an indexer from a third-party module that I recently modified to avoid the problem! I disabled all third-party modules to keep only the bare minimum for testing

Batch size: 10k

without comment registerSnapshot method and without other fix

bin/magento indexer:reset amasty_order_export_attribute_index && bin/magento indexer:reindex amasty_order_export_attribute_index
Amasty: Order Export - Product Attributes indexer has been invalidated.
Amasty: Order Export - Product Attributes index
top | grep "\bphp\b"
1695091 user+  20   0  595852 525936  32080 R 100,0   0,8   0:01.26 php
1695091 user+  20   0 1077900 888956  32080 S  63,8   1,4   0:03.18 php
1695091 user+  20   0 1662092   1,2g  32080 S  75,0   1,9   0:05.43 php
1695091 user+  20   0 2196620   1,6g  32080 R  69,8   2,5   0:07.53 php
1695091 user+  20   0 2655884   1,9g  32080 S  60,3   3,0   0:09.35 php
1695091 user+  20   0 3222156   2,2g  32080 R  61,1   3,5   0:11.19 php
1695091 user+  20   0 3658380   2,5g  32080 S  61,5   4,0   0:13.04 php
1695091 user+  20   0 4057740   2,7g  32080 S  56,5   4,4   0:14.74 php
over 4G > killed

with memory_limit = -1 > around 8G and 20min to rebuilt successfully

now with comment registerSnapshot method and without other fix

bin/magento indexer:reset amasty_order_export_attribute_index && bin/magento indexer:reindex amasty_order_export_attribute_index
Amasty: Order Export - Product Attributes indexer has been invalidated.
Amasty: Order Export - Product Attributes index has been rebuilt successfully in 00:00:57
top | grep "\bphp\b"
1695243 user+  20   0  182216 108128  28980 R  61,7   0,2   0:01.85 php
1695289 user+  20   0  448404 324956  32160 R  28,5   0,5   0:00.86 php
1695289 user+  20   0  491668 360468  32160 R  53,8   0,5   0:02.48 php
1695289 user+  20   0  495764 374136  32160 R  73,4   0,6   0:04.69 php
1695289 user+  20   0  501908 386212  32160 R  64,2   0,6   0:06.63 php
1695289 user+  20   0  499860 388616  32160 R  58,1   0,6   0:08.38 php
1695289 user+  20   0  516456 400936  32160 R  55,1   0,6   0:10.04 php
1695289 user+  20   0  501908 403784  32160 S  63,9   0,6   0:11.97 php
1695289 user+  20   0  503956 408364  32160 S  43,4   0,6   0:13.28 php
1695289 user+  20   0  503956 412064  32160 R  51,8   0,6   0:14.84 php
1695289 user+  20   0  506004 414380  32160 S  53,5   0,6   0:16.46 php
1695289 user+  20   0  510100 419660  32160 S  46,2   0,6   0:17.85 php
1695289 user+  20   0  518292 427504  32160 S  43,5   0,7   0:19.16 php
1695289 user+  20   0  530792 441312  32160 R  49,8   0,7   0:20.66 php
1695289 user+  20   0  530792 441528  32160 R  51,8   0,7   0:22.22 php
1695289 user+  20   0  530792 442604  32160 R  49,0   0,7   0:23.70 php
1695289 user+  20   0  518292 430960  32160 S  48,5   0,7   0:25.16 php
1695289 user+  20   0  518292 431584  32160 S  52,0   0,7   0:26.73 php
1695289 user+  20   0  518292 431764  32160 S  51,5   0,7   0:28.28 php
1695289 user+  20   0  518292 432072  32160 R  43,5   0,7   0:29.59 php

now with comment registerSnapshot method and other fix

bin/magento indexer:reset amasty_order_export_attribute_index && bin/magento indexer:reindex amasty_order_export_attribute_index
Amasty: Order Export - Product Attributes indexer has been invalidated.
Amasty: Order Export - Product Attributes index has been rebuilt successfully in 00:00:36
top | grep "\bphp\b"
1696323 user+  20   0  317332 247244  32180 S   6,2   0,4   0:00.61 php
1696323 user+  20   0  336020 266068  32180 S  30,2   0,4   0:01.52 php
1696323 user+  20   0  338068 268128  32180 S  37,5   0,4   0:02.65 php
1696323 user+  20   0  340116 270180  32180 S  28,2   0,4   0:03.50 php
1696323 user+  20   0  342164 272232  32180 S  34,2   0,4   0:04.53 php
1696323 user+  20   0  352616 282472  32180 R  24,2   0,4   0:05.26 php
1696323 user+  20   0  340116 270184  32180 S  30,6   0,4   0:06.18 php
1696323 user+  20   0  342164 272232  32180 S  26,5   0,4   0:06.98 php
1696323 user+  20   0  344212 274280  32180 S  25,6   0,4   0:07.75 php
1696323 user+  20   0  342164 272232  32180 R  17,0   0,4   0:08.26 php
1696323 user+  20   0  340116 270184  32180 S  21,6   0,4   0:08.91 php
1696323 user+  20   0  344212 274280  32180 S  14,6   0,4   0:09.35 php

I think there is still an accumulation of memory even after that. However the amount to decrease so much that I stopped the track here.

The registerSnapshot method does not seem to be the problem although being able to disable this function on demand would be a plus for memory and execution time.

Here are resume the modifications made to the script to be able to achieve this result:

default:

$collection = $this->orderItemCollectionFactory->create();
...
if ($page <= $collection->getLastPageNumber()) {

    $result = [];
    foreach ($collection->getItems() as $orderItem) {
        $result[$orderItem->getStoreId()][] = $orderItem;
    }

    foreach ($result as $storeId => $orderItems) {
        ...
    }
}
...

rewrite:

$collection = $this->orderItemCollectionFactory->create();
...
if ($page <= $collection->getLastPageNumber()) {
    $collection->getSelect()->limitPage($page, $pageSize);
    $query = $collection->getSelect()->__toString();
    $tmp = $connection->fetchAll($query); // to skip magento class

    $result = [];
    foreach ($collection->getItems() as $_orderItem) {
        $orderItem = clone $this->item; // \Magento\Sales\Model\Order\Item
        $orderItem->setData($_orderItem);
        $result[$orderItem->getStoreId()][] = $orderItem;
    }

    foreach ($result as $storeId => $orderItems) {
        ...
    }
}
...

I hope this will help

Translate by google

bradleybrecher commented 2 years ago

@julien-gillot I am facing the same issue but not fully understanding your solution. Do you mind elaborating?

HenKun commented 2 years ago

In OrderRepository each order fetched by the get() method is stored in a variable registry. This is to ensure consistency of order objects throughout the code base, e.g. when they are fetched in different places and manipulated in those places. Additionally it saves database requests, since the order is returned directly from memory. However, the getList method does not use the registry. This is better for ressource management, when large batches of orders are fetched and not needed afterwards. On the other hand there might be inconsistencies if you are not cautious. So you can use getList and filter by the entity ID, instead of using get

engcom-Hotel commented 2 years ago

Hello @ioweb-gr,

Thanks for the report and collaboration!

We have tried to reproduce the issue with a small set of shipments from our instance and the issue seems reproducible if we have so many shipments.

Please have a look at the below screenshot for reference:

image

Hence confirming the issue.

Thanks

github-jira-sync-bot commented 2 years ago

:white_check_mark: Jira issue https://jira.corp.adobe.com/browse/AC-6059 is successfully created for this GitHub issue.

m2-assistant[bot] commented 2 years ago

:white_check_mark: Confirmed by @engcom-Hotel. Thank you for verifying the issue.
Issue Available: @engcom-Hotel, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

ioweb-gr commented 9 months ago

@engcom-Hotel since you verified this issue could you also take a look at #38418 which is closely related?

magento-deployment-service[bot] commented 8 months ago

Hi @gitevildelta. Thank you for your request. I'm working on Magento instance for you.

magento-deployment-service[bot] commented 8 months ago

Hi @gitevildelta, unfortunately there is no ability to deploy Magento instance at the moment. Please try again later.