long2ice / meilisync

Realtime sync data from MySQL/PostgreSQL/MongoDB to Meilisearch
https://github.com/long2ice/meilisync
Apache License 2.0

Not Fetching Data when specifying field attribute in config.yml #108

Open SimonLeiner opened 5 months ago

SimonLeiner commented 5 months ago

Describe the bug: I successfully connected meilisync to my MongoDB Atlas database. If I don't use the "fields" argument in the YAML file, I receive the data. As soon as I specify the fields, however, all values are None.

My config.yml:

plugins:
  - meilisync.plugin.Plugin
progress:
  type: file
  path: process.json
source:
  type: mongo
  host: ...
  username: ...
  password: ...
  database: ...
meilisearch:
  api_url: ...
  api_key: ...
  insert_size: 1000
  insert_interval: 10
sync:
  - table: userschemas
    index: userschemas
    full: true
    pk: _id
    fields:
      _id:
      email:
      userName:
      firstName:
      lastName:
      gender:

Example of some dummy Data in Mongo DB: image

Logging without Specification:

image

Logging with Specification:

image

Expected behavior: meilisync should fetch the data for the fields listed in the config.


bnussbau commented 4 months ago

I have the same problem. When fields is left empty, the sync succeeds. When fields are specified, their values come through as null, leaving the Meilisearch index empty. I tried both of these forms:

    fields:
      _id: _id
      product_name: product_name

or

    fields:
      _id:
      product_name:

2024-07-02 11:26:24.756 | DEBUG | meilisync.plugin:pre_event:12 - pre_event: progress=None type=<EventType.create: 'create'> table=None data={'_id': 'None', 'product_name': None}, is_global: False

OS: macOS 14.5
Python: 3.12.4
MongoDB: 7.0.12 (via brew)
Meilisearch: 1.8.3 (self-hosted)


bnussbau commented 4 months ago

As a quick workaround: In file https://github.com/long2ice/meilisync/blob/dev/meilisync/source/mongo.py#L26

Replace the variable fields with a MongoDB projection that lists the fields you want, in the form {"field_name": 1}, e.g.:

cursor = collection.find({}, {"product_name": 1, "code": 1, "brands": 1, "product_quantity": 1, "product_quantity_unit": 1, "nutriscore_grade": 1, "product_name_de": 1, "nova_group": 1, "categories": 1})

That worked for me.
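Rather than hard-coding the field list, the same projection could be derived from the fields block in config.yml. The following is only a minimal sketch of that idea, not meilisync's actual code: build_projection is a hypothetical helper, and it assumes that only the keys of the fields mapping (the source field names) matter for the query, with the values being optional target names that may be None.

```python
from typing import Optional


def build_projection(fields: Optional[dict]) -> Optional[dict]:
    """Turn a `fields` mapping from config.yml into a MongoDB projection.

    Hypothetical helper: only the keys (source field names) are used;
    the values (assumed to be optional target names, possibly None)
    are ignored here.
    """
    if not fields:
        # No fields configured: return None so find() yields whole documents.
        return None
    # Mark every configured field for inclusion, as in {"field_name": 1}.
    return {name: 1 for name in fields}


# The `fields` block from the report, as a YAML loader would return it
# (empty values load as None).
fields = {"_id": None, "product_name": None}
projection = build_projection(fields)
print(projection)

# Hypothetical usage with a pymongo collection (not executed here):
# cursor = collection.find({}, projection)
```

Passing None as the projection keeps the behavior reported above for configs without a fields block, where whole documents are synced.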