Closed (leonhelmus closed 3 years ago)
Having only spent a few minutes looking at the code I see this:
vendor/emico/tweakwise-export/src/Model/Write/Products.php:131
=> XML will be flushed (written to file) every 100 iterations
vendor/emico/tweakwise-export/src/Model/Write/EavIterator.php:288
=> iterator batch size is 5000
100 * 5 000 = 500 000 products.
~~In other words, it seems that it will never write the XML to file until ALL products have been added to the XML in memory? So that's 29 595 products * 1 206 attributes = 35 691 570 XML entries in memory?~~
Even if this first impression is correct (since it's very possible my quick assessment may be completely wrong), another factor to consider is whether flushing the XML will actually free up any memory, since the XML object itself will still be available in memory after flushing and will probably still have all items in it?
UPDATE: on second review, the generators are nested, so each iteration is a product and not a batch, and the multiplication above is not correct. The question of whether flushing the XML writer will actually free up memory might still be something to consider though.
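To illustrate the nesting described in the update, here is a minimal sketch. It is written in Python rather than the module's PHP, and `load_batches`, `iter_products` and `export` are hypothetical stand-ins, not the module's actual functions. Only the batch size of 5000, the flush interval of 100 and the product count of 29 595 come from this thread.

```python
from typing import Iterator, List

BATCH_SIZE = 5000   # iterator batch size noted at EavIterator.php:288
FLUSH_EVERY = 100   # flush interval noted at Products.php:131


def load_batches(total: int, batch_size: int) -> Iterator[List[int]]:
    """Hypothetical inner generator: yields one fully loaded batch at a time."""
    for start in range(0, total, batch_size):
        yield list(range(start, min(start + batch_size, total)))


def iter_products(total: int, batch_size: int) -> Iterator[int]:
    """Hypothetical outer generator: flattens batches into single products."""
    for batch in load_batches(total, batch_size):
        yield from batch


def export(total: int) -> int:
    """Count the flushes that occur when flushing every FLUSH_EVERY products."""
    flushes = 0
    for index, _product in enumerate(iter_products(total, BATCH_SIZE), start=1):
        if index % FLUSH_EVERY == 0:
            flushes += 1  # stands in for writing the XML buffer to file
    return flushes


flush_count = export(29_595)
print(flush_count)
```

Because the consuming loop sees one product per iteration, this sketch flushes 295 times for 29 595 products (once per 100 products), rather than once per 100 batches.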
Ok, so what's happening is that 5000 products are loaded fully into memory at once (6+ million entries in memory), and then they are flushed to disk 100 at a time. I think this alone might still be enough to explain the memory consumption we're seeing. We'll test reducing the batch size from 5000 to 500 as a PoC to see what happens to the memory vs speed tradeoff in that case.
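As a rough back-of-the-envelope check of these numbers, the sketch below assumes every product in a batch carries all 1 206 attributes quoted earlier in this thread. That is an upper bound for illustration, not a measurement, and `entries_in_memory` is a hypothetical helper.

```python
ATTRIBUTES_PER_PRODUCT = 1_206  # attribute count quoted earlier in the thread


def entries_in_memory(batch_size: int) -> int:
    """Upper bound on attribute entries held while one batch is loaded."""
    return batch_size * ATTRIBUTES_PER_PRODUCT


current = entries_in_memory(5000)   # batch size found in the code
proposed = entries_in_memory(500)   # batch size planned for the PoC
print(current, proposed)
```

Under that assumption, a batch of 5000 holds 6 030 000 entries and a batch of 500 holds 603 000, a tenfold reduction in the per-batch working set.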
Changing the setting to 500 items reduced memory consumption to 1.2GB which is more manageable, so this seems like the right way forward. This ticket can be closed.
What is the purpose of this issue? Explain the background context.
For our development/production environment we are trying to create a tweakwise:export for our client, but generating the feed seems to take a lot of resources. When running the tweakwise export on our acceptance environment (which has the same stores/categories/products) it takes 2463.98s using 5695.77Mb memory. I would like to know if we can split these exports by store or page? This would help with resource management on our server.
Environment
Steps to reproduce
Actual result
Generating the feed takes 2463.98s and uses 5695.77Mb of memory. The resulting tweakwise.xml is 1.6G. How can the performance of generating a feed be improved? ...
Expected result
Create multiple feeds per store, or add a max page per feed, so that each feed is smaller.