long2ice / meilisync

Realtime sync data from MySQL/PostgreSQL/MongoDB to Meilisearch
https://github.com/long2ice/meilisync
Apache License 2.0

Support doing a full sync without deleting the existing index #81

Open matinone opened 8 months ago

matinone commented 8 months ago

This PR adds a new command-line argument, --keep-index, to the refresh command so that a full sync can run without deleting the existing index. Currently, a full sync recreates the index and adds all the existing data from the database table to it. In that context, deleting and recreating the index makes perfect sense, because otherwise items deleted from the database table would remain in the index. The current full sync also takes advantage of the swap index feature to perform the update without downtime.

However, if you want to sync multiple tables to the same index (so that you can search everything you need in a single index afterwards), a full sync of all the tables to that index does not work: each table sync deletes the index and overwrites the previous content, so in the end only one table is present in the index. This is why an additional option to preserve the index is useful. To sync all the tables, first run the refresh command without the --keep-index option for one table (this works exactly as it does now):

meilisync refresh -t table_1

And then you run it with the --keep-index flag for the rest of the tables you want to sync to the same index:

meilisync refresh --keep-index -t table_2
meilisync refresh --keep-index -t table_3
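
The difference between the two modes can be illustrated with a small in-memory model. This is only a sketch and not meilisync's actual code: the refresh function, the dict standing in for the search server, the temporary index name, and the table-prefixed document keys are all hypothetical and exist only to keep the example self-contained.

```python
from typing import Dict, List

# Hypothetical stand-in for a search server: index name -> {document key -> document}.
Indexes = Dict[str, Dict[str, dict]]


def refresh(indexes: Indexes, index: str, table: str,
            rows: List[dict], keep_index: bool = False) -> None:
    """Model a full sync of one table into one index."""
    # Prefix keys with the table name so rows from different tables never
    # collide in this model (an assumption made for the sketch only).
    docs = {f"{table}:{row['id']}": {**row, "_table": table} for row in rows}

    if keep_index:
        # Preserve the index: add or replace documents, delete nothing.
        indexes.setdefault(index, {}).update(docs)
    else:
        # Default behaviour: build a fresh index, swap it in, drop the old one.
        tmp = f"{index}_tmp"
        indexes[tmp] = docs
        indexes[index], indexes[tmp] = indexes[tmp], indexes.get(index, {})
        del indexes[tmp]


indexes: Indexes = {}
refresh(indexes, "search", "table_1", [{"id": 1}, {"id": 2}])
refresh(indexes, "search", "table_2", [{"id": 1}], keep_index=True)
refresh(indexes, "search", "table_3", [{"id": 7}], keep_index=True)
print(sorted(indexes["search"]))
# ['table_1:1', 'table_1:2', 'table_2:1', 'table_3:7']
```

In this model, running the second and third calls without keep_index=True would leave only the table_3 documents in the index, which is the problem described above. The model also shows the trade-off: with the index preserved, rows that were deleted from a table since the previous sync are not removed from the index.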

The PR also fixes a small bug where the table name wouldn't be available in the plugins when events are generated as a result of a full sync (see the change in the async def add_data(self, sync: Sync, data: list) method).