long2ice / meilisync

Realtime sync data from MySQL/PostgreSQL/MongoDB to Meilisearch
https://github.com/long2ice/meilisync
Apache License 2.0

meilisync

Introduction

Realtime sync data from MySQL/PostgreSQL/MongoDB to Meilisearch.

There is also a web admin dashboard for meilisync: meilisync-admin.

Install

Install from PyPI:

Use docker (Recommended)

You can use docker to run meilisync:

version: "3"
services:
  meilisync:
    image: long2ice/meilisync
    volumes:
      - ./config.yml:/meilisync/config.yml
    restart: always

Prerequisites

Quick Start

If you run meilisync without any arguments, it will try to load the configuration from config.yml in the current directory.

❯ meilisync --help

 Usage: meilisync [OPTIONS] COMMAND [ARGS]...

╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --config              -c      TEXT  Config file path [default: config.yml]                                                                                                         │
│ --install-completion                Install completion for the current shell.                                                                                                      │
│ --show-completion                   Show completion for the current shell, to copy it or customize the installation.                                                               │
│ --help                              Show this message and exit.                                                                                                                    │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ check            Check whether the data in the database is consistent with the data in Meilisearch                                                                                 │
│ refresh          Refresh all data by swap index                                                                                                                                    │
│ start            Start meilisync                                                                                                                                                   │
│ version          Show meilisync version                                                                                                                                            │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Start sync

Start syncing data from MySQL to Meilisearch:

❯ meilisync start
2023-03-07 08:37:25.656 | INFO     | meilisync.main:_:86 - Start increment sync data from "mysql" to Meilisearch...

Refresh sync

Refresh all data by swapping the index:

❯ meilisync refresh -t test

Before refreshing, stop the sync process first to avoid data inconsistency.
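To illustrate the idea of refreshing by index swap, here is a conceptual sketch using an in-memory dict in place of a search server. It is not meilisync's actual code: `refresh_by_swap`, the `_tmp` suffix, and the dict-based `indexes` store are hypothetical names chosen for this example. The point is only that the new data is fully loaded into a temporary index before it replaces the live one.

```python
def refresh_by_swap(indexes, name, rows):
    """Rebuild `name` from `rows` in a temporary index, then swap it in.

    `indexes` is an in-memory stand-in for a search server: a dict that
    maps index names to lists of documents.
    """
    tmp = f"{name}_tmp"  # hypothetical temporary index name
    indexes[tmp] = list(rows)  # 1. full load into the temporary index
    # 2. swap: the live name now points at the new data, the temporary
    #    name at the old data
    indexes[name], indexes[tmp] = indexes[tmp], indexes.get(name, [])
    del indexes[tmp]  # 3. drop the old data
    return indexes[name]


indexes = {"test": [{"id": 1, "title": "stale"}]}
refresh_by_swap(indexes, "test", [{"id": 1, "title": "fresh"}, {"id": 2, "title": "new"}])
print(indexes)
```

Because the live index is replaced in a single step, searches never see a half-built index. Changes that arrive from a running sync process during the rebuild could still be lost, which is consistent with the advice to stop the sync first.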

Check sync

Check whether the data count in the database matches the data count in Meilisearch:

❯ meilisync check -t test

Configuration

Here is an example configuration file:

debug: true
plugins:
  - meilisync.plugin.Plugin
progress:
  type: file
source:
  type: mysql
  host: 192.168.123.205
  port: 3306
  user: root
  password: "123456"
  database: beauty
meilisearch:
  api_url: http://192.168.123.205:7700
  api_key:
  insert_size: 1000
  insert_interval: 10
sync:
  - table: collection
    index: beauty-collections
    plugins:
      - meilisync.plugin.Plugin
    full: true
    fields:
      id:
      title:
      description:
      category:
  - table: picture
    index: beauty-pictures
    full: true
    fields:
      id:
      description:
      category:
sentry:
  dsn: ""
  environment: "production"

debug (optional)

Enables debug mode. The default is false; set it to true to see more logs.

plugins (optional)

Plugins customize the data before or after it is inserted into Meilisearch. The plugins option is a list of Python paths, each naming a plugin class.

A plugin is a Python class with pre_event and post_event methods. The pre_event method is called before the insert into Meilisearch, and the post_event method is called after it.

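# Imports assumed from the meilisync source tree; adjust if your version differs.
from loguru import logger

from meilisync.schemas import Event


class Plugin:
    # False: a new instance is created for each event.
    is_global = False

    async def pre_event(self, event: Event):
        # Called before the event is inserted into Meilisearch.
        logger.debug(f"pre_event: {event}, is_global: {self.is_global}")
        return event

    async def post_event(self, event: Event):
        # Called after the event is inserted into Meilisearch.
        logger.debug(f"post_event: {event}, is_global: {self.is_global}")
        return event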

The is_global attribute indicates whether the plugin instance is global. If it is set to True, the plugin instance is created only once; otherwise, a new plugin instance is created for each event.
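As an illustration, here is a minimal sketch of a custom plugin together with the instantiation rule described above. The `Event` dataclass is a simplified stand-in (the real class ships with meilisync and its fields may differ), and `TitleCleaner`, `get_instance`, and the `data` field are hypothetical names used only for this example.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Event:
    """Simplified stand-in for meilisync's Event (assumed shape)."""
    table: str
    data: dict = field(default_factory=dict)


class TitleCleaner:
    """Hypothetical plugin: trims the title and counts the events it has seen."""
    is_global = True  # one shared instance, so `seen` accumulates across events

    def __init__(self):
        self.seen = 0

    async def pre_event(self, event: Event):
        self.seen += 1
        title = event.data.get("title")
        if isinstance(title, str):
            event.data["title"] = title.strip()
        return event

    async def post_event(self, event: Event):
        return event


_instances = {}


def get_instance(plugin_cls):
    """Hypothetical helper that mirrors the is_global rule."""
    if plugin_cls.is_global:
        # Create the instance on first use, then reuse it for every event.
        if plugin_cls not in _instances:
            _instances[plugin_cls] = plugin_cls()
        return _instances[plugin_cls]
    # Non-global plugins get a fresh instance per event.
    return plugin_cls()


async def handle(event):
    plugin = get_instance(TitleCleaner)
    return await plugin.pre_event(event)


first = asyncio.run(handle(Event("collection", {"id": 1, "title": "  Roses "})))
asyncio.run(handle(Event("collection", {"id": 2, "title": "Tulips"})))
print(first.data["title"], get_instance(TitleCleaner).seen)
```

With is_global set to False in this sketch, each event would get a new `TitleCleaner`, so the counter would restart at zero every time. State that must persist across events therefore belongs in a global plugin.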

progress

The progress section records the last sync position, such as the binlog position for MySQL.
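To make the idea of a file-backed progress store concrete, here is a minimal sketch. It is an assumption about how such a store could work, not meilisync's implementation: `FileProgress`, the `progress.json` file name, and the `log_file` and `log_pos` keys are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class FileProgress:
    """Hypothetical file-backed store for the last sync position."""

    def __init__(self, path="progress.json"):
        self.path = Path(path)

    def get(self):
        """Return the saved position, or None on the first run."""
        if not self.path.exists():
            return None
        return json.loads(self.path.read_text())

    def set(self, position):
        """Persist the position: write a temporary file, then rename it."""
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(position))
        tmp.replace(self.path)  # replaces the old file in one step


store = FileProgress(Path(tempfile.mkdtemp()) / "progress.json")
print(store.get())  # nothing saved yet
store.set({"log_file": "mysql-bin.000001", "log_pos": 154})
print(store.get())
```

On restart, a sync process could read the saved position and resume from there instead of re-reading the whole change log.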

source

Source database configuration. Currently only MySQL, PostgreSQL, and MongoDB are supported.

meilisearch

Meilisearch configuration.

If neither insert_size nor insert_interval is set, each document is inserted immediately.

For better performance, set insert_size and insert_interval and increase them as needed. An insert is made as soon as either condition is met.
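The batching rule above can be sketched as a small buffer that flushes when either threshold is reached. This is an illustrative sketch, not meilisync's code: `BatchBuffer` and its `clock` parameter are hypothetical, and treating insert_interval as a number of seconds is an assumption.

```python
import time


class BatchBuffer:
    """Hypothetical buffer that flushes when either threshold is reached."""

    def __init__(self, insert_size=None, insert_interval=None, clock=time.monotonic):
        self.insert_size = insert_size
        self.insert_interval = insert_interval  # assumed to be in seconds
        self.clock = clock
        self.docs = []
        self.last_flush = clock()

    def add(self, doc):
        """Buffer a document; return the batch to insert, or None to keep waiting."""
        self.docs.append(doc)
        if self.insert_size is None and self.insert_interval is None:
            return self._flush()  # no batching configured: insert immediately
        size_hit = self.insert_size is not None and len(self.docs) >= self.insert_size
        time_hit = (
            self.insert_interval is not None
            and self.clock() - self.last_flush >= self.insert_interval
        )
        if size_hit or time_hit:
            return self._flush()
        return None

    def _flush(self):
        batch, self.docs = self.docs, []
        self.last_flush = self.clock()
        return batch


buf = BatchBuffer(insert_size=2, insert_interval=10)
print(buf.add({"id": 1}))  # still buffering
print(buf.add({"id": 2}))  # size threshold reached, batch returned
```

This sketch only checks the interval when a new document arrives. A long-running process would also need a timer so that a partly filled batch is still flushed when the source goes quiet.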

sync

The sync configuration. You can add multiple sync tasks.

sentry (optional)

Sentry configuration.

License

This project is licensed under the Apache-2.0 License.