elastic / connectors

Official Elastic connectors for third-party data sources
https://www.elastic.co/guide/en/elasticsearch/reference/master/es-connectors.html

[MongoDB] Documents stop being added after 60 seconds. #2942

Open vitavitaliy3 opened 2 weeks ago

vitavitaliy3 commented 2 weeks ago

Bug Description

Documents stop being added to the search index 60 seconds after synchronization starts. But the time counter does not stop at "Sync duration". It stops after ~16 minutes.

After it stops, this error appears: dial tcp 10.xx.xx.xx:9200: connect: connection timed out

Tested on connector versions 8.14.0.0 and 8.15.2.0.

Tested on Elasticsearch versions 8.12.1 and 8.15.1.

[Screenshot: connection timed out]

vitavitaliy3 commented 2 weeks ago

Warning in the logs:

[job_scheduling_service] can't compare offset-naive and offset-aware datetimes
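
For context, this message matches the TypeError that Python raises when a timezone-naive datetime is ordered against a timezone-aware one. The sketch below only reproduces that error and shows one common way to normalize values before comparing. The names compare_fails and as_utc are hypothetical helpers, not part of the connectors code, and treating naive values as UTC is an assumption.

```python
from datetime import datetime, timezone

# One value without tzinfo (naive) and one with tzinfo (aware).
naive = datetime(2024, 1, 1, 12, 0, 0)
aware = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)


def compare_fails(a, b):
    """Return the error message if ordering a and b raises TypeError, else False."""
    try:
        a < b
        return False
    except TypeError as exc:
        return str(exc)


def as_utc(dt):
    """Return an aware UTC datetime; naive input is ASSUMED to already be in UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


print(compare_fails(naive, aware))
print(as_utc(naive) < as_utc(aware))
```

Normalizing both sides with something like as_utc before comparing avoids the error, but whether UTC is the right assumption depends on where the naive value came from.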

seanstory commented 2 weeks ago

Hi @vitavitaliy3, thanks for reporting. This is interesting.

dial tcp 10.xx.xx.xx:9200: connect: connection timed out

I'm inferring that this is Elasticsearch that is timing out? Your screenshot shows port 27017, which I'm less confident in. If this is the Elasticsearch host, have you looked in the Elasticsearch logs?
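
One way to tell which host is unreachable is to attempt a plain TCP connection to each port from the machine running the connector. This is a minimal sketch using only the Python standard library; can_connect is a hypothetical helper, and the host names in the usage note are placeholders rather than values from this issue.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, calling can_connect("es.example.internal", 9200) and can_connect("mongo.example.internal", 27017) from the connector host would show whether either port times out at the network level. A successful connection only proves that the port is reachable, not that the service behind it is healthy.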

Are you using a custom ingest pipeline for your connector? If so, what all are you doing in it? Are you using any machine learning features?

Documents stop being added to the search index 60 seconds after synchronization starts. But the time counter does not stop at "Sync duration". It stops after ~16 minutes.

How are you evaluating this? Are you just polling the count of the index and not seeing changes for the last 15 min? Or are you seeing this from connector logs? If you can share your connector logs, that would be helpful.

Our connectors should transition a job to an Error state if it acts hung for too long. See:

##  Maximal interval of time during which MemQueue does not dequeue a single document
##  For example, if no documents were sent to Elasticsearch within 60 seconds because of
##  Elasticsearch being overloaded, then an error will be raised.
##  This mechanism exists to be a circuit-breaker for stuck jobs and stuck Elasticsearch.
#elasticsearch.bulk.queue_refresh_timeout: 60

So either this isn't working as expected (bug) or something is happening for those last 15 minutes, and it's just not as visible to you as would be useful.
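
To illustrate the idea behind that setting, here is a minimal sketch of an idle-timeout circuit breaker. It is not the connectors' actual implementation: IdleWatchdog and QueueStalledError are hypothetical names, and the injectable clock is an assumption made so the behavior can be checked without waiting.

```python
import time


class QueueStalledError(Exception):
    """Raised when no progress has been recorded within the allowed interval."""


class IdleWatchdog:
    def __init__(self, timeout_seconds=60, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_progress = clock()

    def record_progress(self):
        """Call whenever a document is dequeued and sent."""
        self._last_progress = self._clock()

    def check(self):
        """Raise QueueStalledError if the idle time exceeds the timeout."""
        idle = self._clock() - self._last_progress
        if idle > self._timeout:
            raise QueueStalledError(
                f"no documents dequeued for {idle:.0f}s (limit {self._timeout}s)"
            )
```

If a check like this runs periodically while a sync is in progress, a job that stops sending documents should fail within roughly the timeout rather than appearing to run for another 15 minutes.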

artem-shelkovnikov commented 2 weeks ago

Hi @vitavitaliy3,

In addition to what Sean asked, I have a few questions of my own:

  1. Is your MongoDB server healthy CPU-wise and latency-wise?
  2. Do you self-host it or use a cloud version?
  3. If it's happening in the same place, could it be that your MongoDB contains really large payloads that time out due to slow connection/slow server?
  4. Anything that you can see in the MongoDB server logs?