pdphilip / laravel-elasticsearch

Laravel Elasticsearch: An Elasticsearch implementation of Laravel's Eloquent ORM
MIT License

Auto Re-indexing #33

Closed abdokouta closed 1 month ago

abdokouta commented 1 month ago

Is your feature request related to a problem? Please describe.

Currently, there is a gap in the package's functionality: changes to my Eloquent models in Laravel are not automatically reflected in my Elasticsearch index. This requires manual re-indexing or custom solutions to keep my database and Elasticsearch consistent, which is time-consuming and error-prone.

Describe the solution you'd like

I want the package to support an auto re-indexing feature that automatically updates the Elasticsearch index whenever changes are made to the associated Eloquent models. This should cover creating, updating, and deleting records. Ideally, this feature would be configurable so that developers can enable or disable it as needed, and it should work seamlessly with Laravel's event system.

Describe alternatives you've considered

Manual Re-indexing: Running artisan commands or scripts to manually re-index data. This approach is not efficient for real-time applications and increases maintenance overhead.

Custom Event Listeners: Implementing custom event listeners in Laravel to handle model changes and update the Elasticsearch index. While this works, it requires additional development and can lead to boilerplate code that the package could potentially handle internally.

Scheduled Tasks: Setting up scheduled tasks to periodically re-index data. This approach doesn't provide real-time indexing and can lead to stale data between indexing intervals.

Additional context

Auto re-indexing would greatly enhance the usability of this package, particularly for applications with dynamic data. It would also align with Laravel's philosophy of providing elegant and developer-friendly solutions. Below are some screenshots of the current process.

pdphilip commented 1 month ago

Hey @abdokouta

It would help if I better understood the issues you're running into, i.e., what changes happened where, how they affected the index, and what custom solutions you needed to create. The screenshots mentioned did not show.

Taken at face value, here is some feedback in the meantime:

Re-indexing is common to ES, and the package has the tools available to do it safely, albeit manually. The docs outline the steps here, along with the checks and balances: https://elasticsearch.pdphilip.com/re-indexing

In terms of auto-reindexing, this seems fraught with danger and inefficiencies, since doing it safely involves several steps. Unless I've misunderstood what you mean by auto, this would be akin to changing a field in a MySQL-based Model and Laravel automatically running the migration to accommodate it. ES will already guess the mapping when it finds a new field and add it to the index; the issue comes when an existing field mapping needs to change. Further, there would be considerable overhead in checking mapping versions on every create, update, and delete, and should a re-index be triggered, it could take several minutes (even hours) to execute, depending on your number of records.
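To illustrate the distinction drawn above between a new field (which ES can add on its own) and a changed field mapping (which calls for a re-index), here is a rough Python sketch. It assumes a simplified `{field: type}` dict in place of a real Elasticsearch mapping, and `classify_mapping_changes` is a hypothetical helper, not part of the package.

```python
from typing import Dict, List, Tuple


def classify_mapping_changes(
    current: Dict[str, str], desired: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split field changes into additive ones and ones that need a re-index.

    Both arguments are simplified {field_name: field_type} dicts
    (an assumption for this sketch), not full Elasticsearch mappings.
    """
    # Fields that do not exist yet can simply be added to the index.
    additive = [f for f in desired if f not in current]
    # Fields whose type differs from the existing mapping need a re-index.
    needs_reindex = [
        f for f in desired if f in current and current[f] != desired[f]
    ]
    return additive, needs_reindex


current = {"name": "text", "price": "text"}
desired = {"name": "text", "price": "float", "sku": "keyword"}

additive, needs_reindex = classify_mapping_changes(current, desired)
print(additive)       # ['sku']
print(needs_reindex)  # ['price']
```

Running a comparison like this on every write is the kind of overhead the comment warns about, which is one reason to keep it in a deliberate, migration-style step.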

A reasonable & useful enhancement would be to package the manual steps shown in the docs into an artisan command that calls a 'migration type' file and runs through the process safely. This is in line with Laravel's workflow of managing data in general (via migrations).

If you feel I've missed something please share the issues you're running into in more detail. Thanks

abdokouta commented 1 month ago

@pdphilip Thank you for your response! While I haven't used the package yet, I did go through the documentation. Coming from a Magento background, I'm familiar with how well Magento handles indexing and auto-indexing. Magento provides commands to reindex either a specific index or all indexes at once.

Magento also employs a sophisticated reindexing mechanism where it creates temporary indexes to apply the reindexing process. Once the reindexing is complete, Magento replaces the original index with the newly created one and updates the aliases accordingly. This approach ensures that the data is always consistent and available without downtime during the reindexing process.
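In Elasticsearch terms, the "build a new index, then repoint the alias" step can be expressed as a single request to the `_aliases` endpoint, whose actions are applied atomically. The Python sketch below only builds that request body; the index and alias names are hypothetical, and it is not a claim about how Magento or this package implements the swap.

```python
import json
from typing import Any, Dict


def build_alias_swap(alias: str, old_index: str, new_index: str) -> Dict[str, Any]:
    """Build the body for POST /_aliases that moves `alias` in one call."""
    return {
        "actions": [
            # Detach the alias from the index that was serving reads.
            {"remove": {"index": old_index, "alias": alias}},
            # Attach it to the freshly built index.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: "products" is the alias the application queries.
body = build_alias_swap("products", "products_v1", "products_v2")
print(json.dumps(body, indent=2))
```

Because both actions travel in one request, readers of the alias never see a moment where it points at no index.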

Similarly, in the Laravel ecosystem, Laravel Scout achieves this with Typesense and Meilisearch, where any addition, update, or deletion of MySQL records is reflected in the index. When it comes to mapping, if you add or modify a field in the database, it's essential to update the Elasticsearch (ES) mapping as well, using ES’s update settings or update mapping methods.

Explanation of Magento Indexing and Reindexing:

In Magento, indexing is a process that converts data (like products, categories, and prices) into index tables, which makes it faster to retrieve and display on the front end. Magento uses indexes to improve the performance of your store by pre-processing complex data and saving it in index tables. For instance, when you update a product price or add a new product, the changes are not immediately visible on the front end until the related index is updated.

How Magento Indexing Works:

Indexing: Whenever data changes (like a product update), Magento needs to process this data so that it can be efficiently retrieved from the database. This processing creates indexes, which are used by the front end to display data quickly.

Modes of Indexing:

Update on Save: Indexes are updated as soon as any change is made to the data. This ensures that the front end always reflects the most recent data, but it may impact performance during data updates.

Update on Schedule: Indexes are updated on a scheduled basis via cron jobs. This mode is more performance-friendly because changes are batched and processed together, but the front end might not immediately reflect the most recent data.

Reindexing in Magento:

Reindexing: Reindexing is required when significant changes are made to your data, like adding a new product attribute or changing store configurations. Magento provides CLI commands (bin/magento indexer:reindex) that allow you to reindex either specific indexes or all indexes. This process regenerates the index tables to reflect the latest data.

Temporary Indexes and Aliases:

Temporary Indexes: During reindexing, Magento creates temporary indexes rather than updating the original indexes directly. This ensures that the current data remains unaffected and accessible to users during the reindexing process.

Replacement and Aliases: Once the reindexing process is complete, Magento replaces the original index with the newly created temporary index and updates the aliases. This seamless switch ensures there’s no downtime or data inconsistency, as the front end always points to the correct and most up-to-date index.

Use Cases for Reindexing:

After adding or modifying product attributes.

After significant changes in pricing rules or categories.

To fix inconsistencies between the data and the index.
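The trade-off between the two indexing modes described above (Update on Save and Update on Schedule) can be sketched with a toy in-memory indexer. This is only an illustration in Python: the `Indexer` class and its methods are hypothetical, and the "index" is a plain dict rather than Magento's index tables or an Elasticsearch index.

```python
from typing import Dict, List


class Indexer:
    """Toy indexer contrasting the two modes; the index is just a dict."""

    def __init__(self, mode: str) -> None:
        if mode not in ("save", "schedule"):
            raise ValueError("mode must be 'save' or 'schedule'")
        self.mode = mode
        self.index: Dict[int, dict] = {}
        self.pending: List[dict] = []

    def on_change(self, record: dict) -> None:
        if self.mode == "save":
            # Update on Save: the index reflects the change immediately.
            self.index[record["id"]] = record
        else:
            # Update on Schedule: queue the change for the next batch run.
            self.pending.append(record)

    def run_scheduled(self) -> int:
        """Apply queued changes in one batch, as a cron job would."""
        processed = len(self.pending)
        for record in self.pending:
            self.index[record["id"]] = record
        self.pending.clear()
        return processed


on_save = Indexer("save")
on_save.on_change({"id": 1, "price": 10})
print(on_save.index)  # {1: {'id': 1, 'price': 10}}

scheduled = Indexer("schedule")
scheduled.on_change({"id": 1, "price": 10})
scheduled.on_change({"id": 1, "price": 12})
print(scheduled.index)            # {} until the batch job runs
print(scheduled.run_scheduled())  # 2
print(scheduled.index)            # {1: {'id': 1, 'price': 12}}
```

The scheduled variant shows the stale window directly: the index stays empty until the batch runs, after which only the latest value per record remains.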

pdphilip commented 1 month ago

Hi @abdokouta,

Appreciate the feature request, but it seems like there's a bit of a misunderstanding about what this package is designed for. laravel-elasticsearch is a low-level package that maps Laravel models to Elasticsearch indices (similar to how Laravel models map to MySQL tables) and allows you to use Eloquent queries to administer your Elasticsearch indices directly. It's meant to give developers complete control over indexing (from any source), not to automate it.

I'd recommend that you use the package first to understand its scope and utility. For your use case, you can implement it by creating (for example) an IndexedProduct model that extends this package, setting up a hybrid hasOne relationship with your MySQL Product model, and using Observers to manage syncing Product and IndexedProduct records. This allows you to leverage Elasticsearch's features effectively.

Hybrid Relations: https://elasticsearch.pdphilip.com/es-mysql
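The observer-based syncing suggested above can be sketched in a language-neutral way. The Python below is an in-memory stand-in only: `ProductStore` plays the role of the MySQL Product model and `IndexedProductSync` the role of an observer maintaining IndexedProduct records. Both classes are hypothetical and do not reflect the package's or Laravel's actual API.

```python
from typing import Callable, Dict, List

Listener = Callable[[str, dict], None]


class ProductStore:
    """Stands in for the MySQL-backed Product model."""

    def __init__(self) -> None:
        self.rows: Dict[int, dict] = {}
        self.listeners: List[Listener] = []

    def observe(self, listener: Listener) -> None:
        self.listeners.append(listener)

    def save(self, product: dict) -> None:
        self.rows[product["id"]] = product
        self._notify("saved", product)

    def delete(self, product_id: int) -> None:
        product = self.rows.pop(product_id)
        self._notify("deleted", product)

    def _notify(self, event: str, product: dict) -> None:
        for listener in self.listeners:
            listener(event, product)


class IndexedProductSync:
    """Stands in for an observer that keeps IndexedProduct records in step."""

    def __init__(self) -> None:
        self.documents: Dict[int, dict] = {}

    def __call__(self, event: str, product: dict) -> None:
        if event == "saved":
            # Store a copy so later edits to the source row don't leak in.
            self.documents[product["id"]] = dict(product)
        elif event == "deleted":
            self.documents.pop(product["id"], None)


store = ProductStore()
sync = IndexedProductSync()
store.observe(sync)

store.save({"id": 1, "name": "Lamp"})
store.save({"id": 1, "name": "Desk lamp"})
print(sync.documents)  # {1: {'id': 1, 'name': 'Desk lamp'}}
store.delete(1)
print(sync.documents)  # {}
```

In a Laravel application the same shape would presumably live in a model observer reacting to saved and deleted events, with the indexed copy held by the Elasticsearch-backed model.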

FYI: Having implemented this solution in many of my own projects, I am in the process of publishing a package that does exactly this, using this package under the hood. I'll link it here once it's published.

Good luck with your project

abdokouta commented 1 month ago

@pdphilip thank you for your response, please check these links https://laravel.com/docs/11.x/scout#adding-records, https://github.com/babenkoivan/elastic-scout-driver/blob/master/src/Engine.php

pdphilip commented 1 week ago

FYI @abdokouta - I have released the following package that does what you're looking to do here: https://github.com/pdphilip/elasticlens