magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.56k stars 9.32k forks source link

ElasticSearch Aggregation Bucket Size Parameter Not Passed #11277

Closed dfelton closed 7 years ago

dfelton commented 7 years ago

Elastic Search defaults its bucket size to 10 if not specified in the query. Magento\Elasticsearch\SearchAdapter\Query\Builder\Aggregation::buildBucket() omits this parameter. Because of this, it is possible for ElastiSearch to omit results if more there are more than 10 possible results. This drastically impacts the Layered Navigation block in a negative way if any of the attribute filters are supposed to have more than 10 options to filter by.

According to all ElasticSearch 2.0, 2.1, 2.2, 2.3, and 2.4, all of these Magento supported versions default size to 10. Quoted from their documentation page:

By default, the terms aggregation will return the buckets for the top ten terms ordered by the doc_count. One can change this default behaviour by setting the size parameter.

Preconditions

  1. Magento 2.1.8 EE
  2. ElasticSearch 2.4

Steps to reproduce

  1. Install Magento
  2. Install and configure ElasticSearch 2.4
  3. In Magento admin panel configure Magento to utilize ElasticSearch instead of the default MySQL engine
  4. Add more than to options to "Color" catalog_product attribute (or any product EAV attribute with drop-down or multi-select options). Additionally, make sure the attribute is set to "Filterable (with results)".
  5. Create more than 10 simple products, assigning each product to one of the newly created attribute option values (each product should be unique, and not utilize the same option as another, so that we can ensure more than 10 filterable options on the front end.
  6. Create a catalog category, assigning all newly created products from Step 5 to this category. Make sure category is set to be an "Anchor" category so that it's front end display contains the Layered Navigation block.
  7. Perform a full reindex so as to ensure all catalog information is propagated up into ElasticSearch
  8. Navigate to the newly created category page on the front end.

Expected result

  1. All possible "Color" options should appear in the Layered Navigation.

Actual result

  1. Only up to 10 "Color" options will appear in the Layered Navigation.

Our Solution

We've extended Magento\Elasticsearch\SearchAdapter\Query\Builder\Aggregation, overriding the buildBucket method to add a hard coded "size" parameter with a value of 2000, which was suitable for our needs but others will have to determine their own appropriate value for this. ElasticSearch's documentation states that using "0" is deprecated as of 2.4, but I can confirm that it does still work.

Source Code

(Assumes you already know how to build a basic module, I've only included what was added to an existing module in order to patch this bug)

app/code/VendorName/ModuleName/etc/di.xml

Add to your configuration:

<preference for="Magento\Elasticsearch\SearchAdapter\Query\Builder\Aggregation" type="VendorName\ModuleName\Magento\Elasticsearch\SearchAdapter\Query\Builder\Aggregation"/>

app/code/VendorName/ModuleName/Magento/Elasticsearch/SearchAdapter/Query/Builder/Aggregation.php

<?php

namespace VendorName\ModuleName\Magento\Elasticsearch\SearchAdapter\Query\Builder;

use Magento\Framework\Search\Request\BucketInterface;

class Aggregation extends \Magento\Elasticsearch\SearchAdapter\Query\Builder\Aggregation
{
    /**
     * Build aggregation query for bucket
     *
     * Modified from core to provide "size" parameter for "terms".
     *
     * @param array $searchQuery
     * @param BucketInterface $bucket
     * @return void
     */
    protected function buildBucket(
        array $searchQuery,
        BucketInterface $bucket
    ) {
        $field = $this->fieldMapper->getFieldName($bucket->getField());
        switch ($bucket->getType()) {
            case BucketInterface::TYPE_TERM:
                $searchQuery['body']['aggregations'][$bucket->getName()]= [
                    'terms' => [
                        'size' => 2000, // <-- ONLY NEW LINE ADDED TO METHOD
                        'field' => $field,
                    ],
                ];
                break;
            case BucketInterface::TYPE_DYNAMIC:
                $searchQuery['body']['aggregations'][$bucket->getName()]= [
                    'extended_stats' => [
                        'field' => $field,
                    ],
                ];
                break;
        }
        return $searchQuery;
    }
}

Before Picture (max of 10 filter options per attribute show)

before

After Picture (all available filters now show)

after

okorshenko commented 7 years ago

Hi @dfelton looks like you already have a solution. Would like to submit PR?

dfelton commented 7 years ago

@okorshenko Yes I could certainly do that.

Would the Magento team prefer to use a value of 0 (deprecated in ElasticSearch 2.4, but still works), or a hard set large value for the bucket size?

dfelton commented 7 years ago

@okorshenko

Actually I just noticed that ElasticSearch doesn't seem to be included in the Community Edition of Magento. Therefore I wouldn't know how to create a pull request for this particular issue as it is a EE specific problem.

kervin commented 7 years ago

Thanks dfelton!

magento-engcom-team commented 7 years ago

@dfelton, thank you for your report. Please follow these guidelines for proper tracking of your issue. You can report Commerce-related issues in one of two ways: You can use the Support portal associated with your account or If you are a Partner reporting on behalf of a merchant, use the Partner portal.

GitHub is intended for Magento Open Source users to report on issues related to Open Source only. There are no account management services associated with GitHub.

dfelton commented 7 years ago

@magento-engcom-team

Thank you I will communicate this issue and my findings through the Support Portal.

Does Magento have a forum or some other medium that it would be appropriate to report and discuss Enterprise specific issues in a public context? Sometimes can be difficult to find community driven discussion around Enterprise specific modules.

I am going to leave the GitHub issue open for exposure purposes but if you guys prefer it be closed I understand. Thank you.

okorshenko commented 7 years ago

Hi @dfelton you can visit the Forum or Community Slack magentocommeng.slack.com to discuss any Magento related questions