Open andreas-gruenwald opened 4 years ago
Can you provide a PR, with comments explaining why we are doing it this way?
I am not sure this solution is ready for a PR yet, as it might be fragile with regard to nested aggregations, etc. We will investigate it within the project, and I will create a PR as soon as there is a stable outcome.
Performance Improvement
Problem Summary
In one of our projects, retrieving a simple category filter takes more than 1.2 seconds. I found that some lines of code and data structures used for loading Elasticsearch aggregations can be improved. In particular, the conversion of aggregations and buckets in the Elasticsearch product list seems to consume a lot of time for large datasets (more than 20,000 categories in total; 20 on the first level).
Problem Details
This is the initial test code:
I dug into ProductList\ElasticSearch\AbstractElasticSearch and doLoadGroupByValues and found that extracting the filter aggregations and buckets is very time-intensive. The following code sequence took 568 ms with the debugger enabled.
Solution Concept
I used the following code to demonstrate that the retrieval could be done much faster: 32 ms (vs. the original 568 ms).
With the debugger disabled, it is still 27 ms vs. 94 ms!
This example is just a demonstration that arrays as a data structure can cause a lot of performance overhead. The code above should be refactored carefully. In general, it might be very helpful to profile the Pimcore ecommerce product lists with larger numbers of categories/aggregations, as such profiling turns out to be very useful for identifying potential performance issues.
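To make the idea concrete, here is a minimal sketch of reading the buckets for one field in a single pass, without building intermediate nested structures. It is written in Python purely for illustration (the Pimcore code itself is PHP) and is not the code from this issue. The function name `extract_buckets` and the `value`/`count` output keys are hypothetical, and it assumes that a filter aggregation wraps the terms aggregation under the same field name, which may not match every setup. The `buckets`, `key`, and `doc_count` keys follow the standard Elasticsearch terms aggregation response.

```python
from typing import Any, Dict, List


def extract_buckets(aggregations: Dict[str, Any], field: str) -> List[Dict[str, Any]]:
    """Flatten the buckets of one field into [{"value": ..., "count": ...}, ...].

    Hypothetical helper: assumes a wrapping filter aggregation nests the
    terms aggregation under the same field name.
    """
    node = aggregations.get(field, {})
    # Descend through wrapper levels until a "buckets" list is found.
    while "buckets" not in node:
        inner = node.get(field)
        if not isinstance(inner, dict):
            return []  # no buckets for this field
        node = inner
    # Single pass over the buckets, no intermediate copies.
    return [{"value": b["key"], "count": b["doc_count"]} for b in node["buckets"]]


# Sample response shaped like a filter aggregation wrapping a terms aggregation.
response = {
    "aggregations": {
        "category": {
            "doc_count": 3,
            "category": {
                "buckets": [
                    {"key": "shoes", "doc_count": 2},
                    {"key": "hats", "doc_count": 1},
                ]
            },
        }
    }
}

flat = extract_buckets(response["aggregations"], "category")
```

Nested aggregations with differently named sub-aggregations would need extra handling, which is exactly the fragile part mentioned above.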