jolicode / elastically

πŸ” JoliCode's Elastica wrapper to bootstrap Elasticsearch PHP integrations
MIT License
249 stars 40 forks source link
elastica elasticsearch hacktoberfest php symfony

Elastically, Elastica based framework

Opinionated Elastica based framework to bootstrap PHP and Elasticsearch implementations.

Main features:

[!IMPORTANT] Require PHP 8.0+ and Elasticsearch 8+.

Works with Elasticsearch 7 as well, but is not officially supported by Elastica 8. Use with caution.

Version 2+ does not work with OpenSearch anymore due to restrictions added by Elastic on their client.

You can check the changelog and the upgrade documents.

Installation

composer require jolicode/elastically

Demo

[!TIP] If you are using Symfony, you can move to the Symfony chapter

Quick example of what the library do on top of Elastica:

// Your own DTO, or one generated by Jane (see below)
class Beer
{
    public string $foo;
    public string $bar;
}

use JoliCode\Elastically\Factory;
use JoliCode\Elastically\Model\Document;

// Factory object with Elastica options + new Elastically options in the same array
$factory = new Factory([
    // Where to find the mappings
    Factory::CONFIG_MAPPINGS_DIRECTORY => __DIR__.'/mappings',
    // What objects to find in each index
    Factory::CONFIG_INDEX_CLASS_MAPPING => [
        'beers' => Beer::class,
    ],
]);

// Class to perform request, same as the Elastica Client
$client = $factory->buildClient();

// Class to build Indexes
$indexBuilder = $factory->buildIndexBuilder();

// Create the Index in Elasticsearch
$index = $indexBuilder->createIndex('beers');

// Set the proper aliases
$indexBuilder->markAsLive($index, 'beers');

// Class to index DTO(s) in an Index
$indexer = $factory->buildIndexer();

$dto = new Beer();
$dto->bar = 'American Pale Ale';
$dto->foo = 'Hops from Alsace, France';

// Add a document to the queue
$indexer->scheduleIndex('beers', new Document('123', $dto));
$indexer->flush();

// Set parameters on the Bulk
$indexer->setBulkRequestParams([
    'pipeline' => 'covfefe',
    'refresh' => 'wait_for'
]);

// Force index refresh if needed
$indexer->refresh('beers');

// Get the Document (new!)
$results = $client->getIndex('beers')->getDocument('123');

// Get the DTO (new!)
$results = $client->getIndex('beers')->getModel('123');

// Perform a search
$results = $client->getIndex('beers')->search('alsace');

// Get the Elastic Document
$results->getDocuments()[0];

// Get the Elastica compatible Result
$results->getResults()[0];

// Get the DTO πŸŽ‰ (new!)
$results->getResults()[0]->getModel();

// Create a new version of the Index "beers"
$index = $indexBuilder->createIndex('beers');

// Slow down the Refresh Interval of the new Index to speed up indexation
$indexBuilder->slowDownRefresh($index);
$indexBuilder->speedUpRefresh($index);

// Set proper aliases
$indexBuilder->markAsLive($index, 'beers');

// Clean the old indices (close the previous one and delete the older)
$indexBuilder->purgeOldIndices('beers');

// Mapping change? Just call migrate and enjoy a full reindex (use the Task API internally to avoid timeout)
$newIndex = $indexBuilder->migrate($index);
$indexBuilder->speedUpRefresh($newIndex);
$indexBuilder->markAsLive($newIndex, 'beers');

[!NOTE] scheduleIndex is here called with "beers" index because the index was already created before. If you are creating a new index and want to index documents into it, you should pass the Index object directly.

mappings/beers_mapping.yaml

# Anything you want, no validation
settings:
    number_of_replicas: 1
    number_of_shards: 1
    refresh_interval: 60s
mappings:
    dynamic: false
    properties:
        foo:
            type: text
            analyzer: english
            fields:
                keyword:
                    type: keyword

Configuration

This library add custom configurations on top of Elastica's:

Factory::CONFIG_MAPPINGS_DIRECTORY (required with default configuration)

The directory Elastically is going to look for YAML.

When creating a foobar index, a foobar_mapping.yaml file is expected.

If an analyzers.yaml file is present, all the indices will get it.

Factory::CONFIG_INDEX_CLASS_MAPPING (required)

An array of index name to class FQN.

[
  'indexName' => My\AwesomeDTO::class,
]

Factory::CONFIG_MAPPINGS_PROVIDER

An instance of MappingProviderInterface.

If this option is not defined, the factory will fall back to YamlProvider and will use Factory::CONFIG_MAPPINGS_DIRECTORY option.

There are two providers available in Elastically: YamlProvider and PhpProvider.

Factory::CONFIG_SERIALIZER (optional)

A SerializerInterface compatible object that will be used on indexation.

Default to Symfony Serializer with Object Normalizer.

A faster alternative is to use Jane to generate plain PHP Normalizer, see below. Also, we recommend customization to handle things like Date.

Factory::CONFIG_DENORMALIZER (optional)

A DenormalizerInterface compatible object that will be used on search results to build your objects back.

If this option is not defined, the factory will fall back to Factory::CONFIG_SERIALIZER option.

Factory::CONFIG_SERIALIZER_CONTEXT_BUILDER (optional)

An instance of ContextBuilderInterface that build a serializer context from a class name.

If it is not defined, Elastically, will use a StaticContextBuilder with the configuration from Factory::CONFIG_SERIALIZER_CONTEXT_PER_CLASS.

Factory::CONFIG_SERIALIZER_CONTEXT_PER_CLASS (optional)

Allow to specify the Serializer context for normalization and denormalization.

[
    Beer::class => ['attributes' => ['title']],
];

Default to [].

Factory::CONFIG_BULK_SIZE (optional)

When running indexation of lots of documents, this setting allow you to fine-tune the number of document threshold.

Default to 100.

Factory::CONFIG_INDEX_PREFIX (optional)

Add a prefix to all indexes and aliases created via Elastically.

Default to null.

Usage in Symfony

Configuration

You'll need to add the bundle in bundles.php:

// config/bundles.php
return [
    // ...
    JoliCode\Elastically\Bridge\Symfony\ElasticallyBundle::class => ['all' => true],
];

Then configure the bundle:

# config/packages/elastically.yaml
elastically:
    connections:
        # You can create multiple clients
        default:
            # Any Elastica option works here
            client:
                hosts:
                    - '127.0.0.1:9200'

            # Path to the mapping directory (in YAML)
            mapping_directory:       '%kernel.project_dir%/config/elasticsearch'

            # Size of the bulk sent to Elasticsearch (default to 100)
            bulk_size:               100

            # Mapping between an index name and a FQCN
            index_class_mapping:
                my-foobar-index:     App\Dto\Foobar

            # Configuration for the serializer
            serializer:
                # Fill a static context
                context_mapping:
                    foo:                 bar

            # If you want to add a prefix for your index in elasticsearch (you can still call it by its base name everywhere!)
            # prefix: '%kernel.environment%'

            # Use HttpClient component
            transport_config:
                http_client: 'Psr\Http\Client\ClientInterface'

Finally, inject one of those service (autowirable) in you code where you need it:

JoliCode\Elastically\Client (elastically.default.client)
JoliCode\Elastically\IndexBuilder (elastically.default.index_builder)
JoliCode\Elastically\Indexer (elastically.default.indexer)

Advanced Configuration

Multiple Connections and Autowiring

If you define multiple connections, you can define a default one. This will be useful for autowiring:

elastically:
    default_connection: default
    connections:
        default: # ...
        another: # ...

To use class for other connection, you can use Autowirable Types. To discover them, run:

bin/console debug:autowiring elastically
Use a Custom Serializer Context Builder
elastically:
    default_connection: default
    connections:
        default:
            serializer:
                context_builder_service: App\Elastically\Serializer\ContextBuilder
                # Do not define "context_mapping" option anymore
Use a Custom Mapping provider
elastically:
    default_connection: default
    connections:
        default:
            mapping_provider_service: App\Elastically\MappingProvider
            # Do not define "index_class_mapping" option anymore
Using HttpClient as Transport

You can also use the Symfony HttpClient for all Elastica communications:

JoliCode\Elastically\Transport\HttpClientTransport: ~

JoliCode\Elastically\Client:
    arguments:
        $config:
            hosts:
                - '127.0.0.1:9200'
            transport_config:
                http_client: 'Psr\Http\Client\ClientInterface'

See the official documentation on how to get a PSR-18 client.

Reference

You can run the following command to get the default configuration reference:

bin/console config:dump elastically

Using Messenger for async indexing

Elastically ships with a default Message and Handler for Symfony Messenger.

Register the message in your configuration:

framework:
    messenger:
        transports:
            async: "%env(MESSENGER_TRANSPORT_DSN)%"

        routing:
            # async is whatever name you gave your transport above
            'JoliCode\Elastically\Messenger\IndexationRequest':  async

services:
    JoliCode\Elastically\Messenger\IndexationRequestHandler: ~

The IndexationRequestHandler service depends on an implementation of JoliCode\Elastically\Messenger\DocumentExchangerInterface, which isn't provided by this library. You must provide a service that implements this interface, so you can plug your database or any other source of truth.

Then from your code you have to call:

use JoliCode\Elastically\Messenger\IndexationRequest;
use JoliCode\Elastically\Messenger\IndexationRequestHandler;

$bus->dispatch(new IndexationRequest(Product::class, '1234567890'));

// Third argument is the operation, so for a "delete" add this argument:
// new IndexationRequest(Product::class, 'ref9999', IndexationRequestHandler::OP_DELETE);

And then consume the messages:

php bin/console messenger:consume async

Grouping IndexationRequest in a spool

Sending multiple IndexationRequest during the same Symfony Request is not always appropriate, it will trigger multiple Bulk operations. Elastically provides a Kernel listener to group all the IndexationRequest in a single MultipleIndexationRequest message.

To use this mechanism, we send the IndexationRequest in a memory transport to be consumed and grouped in a really async transport:

messenger:
    transports:
        async: "%env(MESSENGER_TRANSPORT_DSN)%"
        queuing: 'in-memory:///'

    routing:
        'JoliCode\Elastically\Messenger\MultipleIndexationRequest': async
        'JoliCode\Elastically\Messenger\IndexationRequest': queuing

You also need to register the subscriber:

services:
    JoliCode\Elastically\Messenger\IndexationRequestSpoolSubscriber:
        arguments:
            - '@messenger.transport.queuing' # should be the name of the memory transport
            - '@messenger.default_bus'
        tags:
            - { name: kernel.event_subscriber }

Using Jane to build PHP DTO and fast Normalizers

Install JanePHP json-schema tools to build your own DTO and Normalizers. All you have to do is setting the Jane-completed Serializer on the Factory:

$factory = new Factory([
    Factory::CONFIG_SERIALIZER => $serializer,
]);

[!CAUTION] Elastically is not compatible with Jane < 6.

To be done

Sponsors

JoliCode

Open Source time sponsored by JoliCode.