la-haute-societe / craft-elasticsearch

Elasticsearch plugin for Craft CMS

Bring the power of Elasticsearch to your Craft CMS projects.

Requirements

This plugin works with Craft CMS 3 or 4.

In order to index data, you will need an Elasticsearch 6.0 (or later) instance, with the Ingest attachment processor plugin activated.

Installation

The easy way

Just install the plugin from the Craft Plugin Store.

Using Composer

Elasticsearch plugin Overview

The Elasticsearch plugin automatically indexes each entry and Craft Commerce product (if Commerce is installed) on your site(s).

It works out the best Elasticsearch mapping for you based on each site's language.

Supported languages

Craft language Elastic analyzer Notes
ar arabic
hy armenian
eu basque
bn bengali
pt-BR brazilian
bg bulgarian
ca catalan
zh cjk
ja cjk
ko cjk
cs czech
da danish
nl dutch
en english
fi finnish
fr french
gl galician
de german
el greek
hi hindi
hu hungarian
id indonesian
ga irish
it italian
lv latvian
lt lithuanian
nb norwegian
fa persian
pt portuguese
ro romanian
ru russian
uk ukrainian analysis-ukrainian plugin needed
es spanish
pl polish analysis-stempel plugin needed
sv swedish
tr turkish
th thai

Configuring the Elasticsearch plugin

You can configure the Elasticsearch plugin from the Craft Control Panel (some settings only), or from the config/elasticsearch.php file in your Craft installation (all settings). If a setting is defined both in the CP and in the configuration file, the configuration file takes precedence.

The src/config.php file is a configuration template to be copied to config/elasticsearch.php.

💡 If you currently don't have an Elasticsearch server handy, here's how you can set one up.

In both the configuration file and the CP

elasticsearchEndpoint

Type: string

The Elasticsearch instance endpoint URL (with protocol, host and port). Ignored if elasticsearchComponentConfig is set.

isAuthEnabled

Type: bool

A boolean indicating whether authentication is required on the Elasticsearch instance. Ignored if elasticsearchComponentConfig is set.

username

Type: string

The username used to authenticate on the Elasticsearch instance if it's protected by X-Pack Security. Ignored if isAuthEnabled is set to false or elasticsearchComponentConfig is set.

password

Type: string

The password used to authenticate on the Elasticsearch instance if it's protected by X-Pack Security. Ignored if isAuthEnabled is set to false or elasticsearchComponentConfig is set.

indexNamePrefix

Type: string

Index name prefix used to avoid index name collision when using a single Elasticsearch instance with several Craft instances.
Up to 5 characters, all lowercase.

highlight

Type: array

The Elasticsearch configuration used to highlight query results. Only pre_tags and post_tags are configurable in the CP; advanced configuration must be done in the file. For more options, refer to the Elasticsearch documentation.

blacklistedEntryTypes

Type: string[]

An array of entry type handles. Entries of those types won't be indexed.

blacklistedAssetVolumes

Type: string[]

An array of asset volume handles. Assets in those volumes won't be indexed.
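As an illustration of the settings above, a config/elasticsearch.php file could look like the following sketch. Every value is a placeholder: the endpoint, credentials, prefix, entry type handle and volume handle are hypothetical, reading the password from an ELASTICSEARCH_PASSWORD environment variable is only a suggestion, and passing the highlight tags as plain strings is an assumption.

```php
<?php
// config/elasticsearch.php
// Sketch only: all values below are hypothetical placeholders.

return [
    // Endpoint URL with protocol, host and port
    'elasticsearchEndpoint'   => 'http://localhost:9200',

    // Credentials are only used when isAuthEnabled is true
    'isAuthEnabled'           => true,
    'username'                => 'elastic',
    'password'                => getenv('ELASTICSEARCH_PASSWORD') ?: '',

    // Up to 5 characters, all lowercase
    'indexNamePrefix'         => 'blog',

    // Tags wrapped around highlighted terms (string form is an assumption)
    'highlight'               => [
        'pre_tags'  => '<strong>',
        'post_tags' => '</strong>',
    ],

    // Hypothetical handles: entries and assets matching these are not indexed
    'blacklistedEntryTypes'   => ['internalNote'],
    'blacklistedAssetVolumes' => ['privateFiles'],
];
```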

Only in the configuration file

contentExtractorCallback

Type: callable

A callback function (function(string $entryContent): string) used to extract the content to be indexed from the full HTML source of the entry's page.

The default is to extract the HTML code between these two comments: <!-- BEGIN elasticsearch indexed content --> and <!-- END elasticsearch indexed content -->.
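A custom extractor could, for example, keep only the contents of the page's <main> element. The sketch below assumes your templates output a single <main> element; the regular expression is illustrative and not part of the plugin.

```php
<?php
// config/elasticsearch.php
// Sketch only: assumes each entry page contains one <main> element.

return [
    'contentExtractorCallback' => function (string $entryContent): string {
        // Capture everything between the first <main ...> tag and the last </main> tag.
        if (preg_match('/<main\b[^>]*>(.*)<\/main>/is', $entryContent, $matches)) {
            return $matches[1];
        }

        // Fall back to the full page source when no <main> element is found.
        return $entryContent;
    },
];
```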

elementContentCallback

Type: callable

A callback function (function (\craft\base\ElementInterface $element): string) used to get the HTML content for the given element to index.

Note:

  • If this parameter is not set or null, the default Guzzle client implementation will be used to get the HTML content of the element. If set, you will have to handle that part yourself.
  • Content should be returned as HTML content in order to be correctly indexed.
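As a rough sketch, a callback could build the HTML directly from the element instead of fetching its page. The body field handle is hypothetical. Wrapping the output in the default BEGIN/END comments is a precaution, assuming the content extractor may still be applied to whatever this callback returns.

```php
<?php
// config/elasticsearch.php
// Sketch only: "body" is a hypothetical custom field handle.

use craft\base\ElementInterface;

return [
    'elementContentCallback' => function (ElementInterface $element): string {
        // Escape the title so it can be embedded safely in HTML.
        $title = htmlspecialchars((string)$element->title, ENT_QUOTES);

        // Use an empty string when the element has no "body" field or it is empty.
        $body = (string)($element->body ?? '');

        // Keep the default markers so the default content extractor can still find the content.
        return '<!-- BEGIN elasticsearch indexed content -->'
            . "<article><h1>{$title}</h1>{$body}</article>"
            . '<!-- END elasticsearch indexed content -->';
    },
];
```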

resultFormatterCallback

Type: callable

A callback function (function (array $formattedResult, $result): array) used to prepare and format the Elasticsearch result object so that it can be used by the results Twig view.
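For instance, the callback could expose an additional indexed field to the template. This sketch assumes that a color field was declared under extraFields and that $result exposes indexed fields as properties; both are assumptions, not documented behavior.

```php
<?php
// config/elasticsearch.php
// Sketch only: assumes an extra field named "color" is indexed.

return [
    'resultFormatterCallback' => function (array $formattedResult, $result): array {
        // Copy the (assumed) extra field into the array handed to the Twig view.
        $formattedResult['color'] = $result->color ?? null;

        return $formattedResult;
    },
];
```

In the results template, the value would then be available as result.color.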

elasticsearchComponentConfig

Type: array

An associative array passed to the yii2-elasticsearch component Connection class constructor. All public properties of the yii2-elasticsearch component Connection class can be set. If this is set, the elasticsearchEndpoint, username, password and isAuthEnabled settings will be ignored.
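A sketch of what this could look like, using public properties of the yii2-elasticsearch Connection class (autodetectCluster, defaultProtocol, nodes, auth, connectionTimeout, dataTimeout). The host name and credentials are placeholders.

```php
<?php
// config/elasticsearch.php
// Sketch only: host name and credentials are hypothetical.

return [
    'elasticsearchComponentConfig' => [
        // Only talk to the nodes listed below instead of discovering the cluster
        'autodetectCluster' => false,
        'defaultProtocol'   => 'https',
        'nodes'             => [
            [
                'protocol'     => 'https',
                'http_address' => 'elasticsearch.example.com:9200',
            ],
        ],
        // HTTP basic authentication credentials
        'auth'              => [
            'username' => 'elastic',
            'password' => getenv('ELASTICSEARCH_PASSWORD') ?: '',
        ],
        // Timeouts, in seconds
        'connectionTimeout' => 10,
        'dataTimeout'       => 30,
    ],
];
```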

extraFields

Type: array

An associative array declaring additional fields to be indexed along with the default ones. See Indexing of additional data for more details.

Indexable content

By default, the content indexed in each entry is between the <!-- BEGIN elasticsearch indexed content --> and <!-- END elasticsearch indexed content --> HTML comments in the source of the entry page.

If you're using semantic HTML in your templates, then putting your <main> or <article> element between those comments should be ideal.

If you need more control over what is indexed, you'll have to set up a custom contentExtractorCallback.

Running a search

The search feature can be used from a frontend template by calling craft.elasticsearch.search('Something to search'). For instance, in a template search/index.twig:

{% set results = craft.elasticsearch.search(craft.app.request.get('q')) %}

{% block content %}
    <h1>{{ "Search"|t }}</h1>

    <form action="{{ url('search') }}">
        <input type="search" name="q" placeholder="Search" value="{{ craft.app.request.get('q') }}">
        <input type="submit" value="Go">
    </form>

    {% if results|length %}
        <h2>{{ "Results"|t }}</h2>

        {% for result in results %}
            <h3>{{ result.title }}</h3>
            <p>
                <small><a href="{{ result.url|raw }}">{{ result.url }}</a><br/>
                    {% if result.highlights|length %}
                        {% for highlight in result.highlights %}
                            {{ highlight|raw }}<br/>
                        {% endfor %}
                    {% endif %}
                </small>
            </p>
            <hr>
        {% endfor %}
    {% else %}
        {% if craft.app.request.get('q') is not null %}
            <p>
                <em>{{ "No results"|t }}</em>
            </p>
        {% endif %}
    {% endif %}
{% endblock %}

Each entry consists of the following attributes:

Notes:

Auto indexing

The plugin automatically indexes entries and Craft Commerce products when they are created, updated or removed, as long as they are enabled and their entry type is not blacklisted.

All entries are reindexed (in the background) when plugin settings are saved.

Elasticsearch plugin utilities

If your Elasticsearch index becomes out of sync with your sites' contents, go to Utilities → Elasticsearch and click the Reindex all button.

Elasticsearch plugin console commands

The plugin provides an extension to the Craft console command that lets you reindex all entries or recreate empty indexes.

Recreate empty indexes

Remove the existing index and create an empty one for every site:

./craft elasticsearch/elasticsearch/recreate-empty-indexes

Reindex all sites

./craft elasticsearch/elasticsearch/reindex-all

Notes:

  • The command will probably fail if a site has no specific domain assigned to it. For instance, avoid using @web as a base URL.

Indexing of additional data

Simple way using the configuration file

To easily index additional data (Elasticsearch fields), you can declare them using the extraFields parameter in the plugin configuration file.

Each field is declared as an entry of an associative array: the key is the field name and the value is an associative array configuring the field's behavior:

For example, to declare a color field in the configuration file, one could do:

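...
  'extraFields' => [
    'color' => [
        // Elasticsearch mapping for the field
        'mapping'     => [
            'type'  => 'text',
            'store' => true
        ],
        // Highlighter options for the field (an empty object means default options)
        'highlighter' => (object)[],
        // Callback returning the value to index for the given element
        'value'       => function (\craft\base\ElementInterface $element, \lhs\elasticsearch\records\ElasticsearchRecord $esRecord) {
            // $esRecord->whatEverMethod();
            return \craft\helpers\ArrayHelper::getValue($element, 'color.hex');
        }
    ],
  ],
...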

More complex way to get even more control

You can get even more control over your additional data by listening to the following events in some project module:

Tips

Enable fuzziness

As with the other search query parameters, you can set fuzziness by altering the default search query as follows:

use lhs\elasticsearch\events\SearchEvent; // namespace assumed, adjust if it differs
use lhs\elasticsearch\records\ElasticsearchRecord;
use yii\base\Event;

Event::on(ElasticsearchRecord::class, ElasticsearchRecord::EVENT_BEFORE_SEARCH, function (SearchEvent $event) {
    /** @var ElasticsearchRecord $esRecord */
    $esRecord = $event->sender;
    $query = $event->query;
    // Customise the query params
    $queryParams = $esRecord->getQueryParams($query);
    $queryParams['bool']['must'][0]['multi_match']['fuzziness'] = 'AUTO'; // Adjust value to your needs here
    $esRecord->setQueryParams($queryParams);
});

Troubleshooting

La Haute Société

Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S. and in other countries.