elastic / connectors

Source code for all Elastic connectors, developed by the Search team at Elastic, and home of our Python connector development framework
https://www.elastic.co/guide/en/enterprise-search/master/index.html
Other
58 stars 117 forks source link

Github Sync job gets terminated if the API token is rate limited at start #2572

Open gcasar opened 1 month ago

gcasar commented 1 month ago

Bug Description

Given a github API token that is reused in another place and has its rate limits hit When a new Sync job is triggered it will be terminated by the orchestrator

To Reproduce

Steps to reproduce the behavior:

  1. Use a Github token to sync a large repository
  2. Once the Token has hit the rate limit create a new connector with the same one and schedule a sync job
  3. After about 5 minutes the job will be terminated and marked as failed

Expected behavior

Sync job should wait for data to start coming in, observing rate limits and not terminated by the scheduler.

Environment

8.13.2, using a Github cloud native connector

praveen-elastic commented 1 month ago

Thanks for reporting this issue. @parthpuri-elastic will work on it and try to resolve this

parthpuri-elastic commented 1 month ago

Yes, working on it

khushbu-elastic commented 1 month ago

@gcasar We tried to reproduce the issue but it is working as expected and the issue is not reproducible. We waited till the retry time (as mentioned in the logs) and successfully able to complete the sync. Attaching the log file for reference.

github.log

gcasar commented 1 month ago

Thank you for looking into this! Peeking at the logs you seem to be using the connector version 8.14 which is not yet available for the cloud. I'll try to reproduce it once the release happens👍

khushbu-elastic commented 1 week ago

@gcasar Is there any update on this issue? Please let us know so that we can update the status accordingly. CC: @seanstory

seanstory commented 1 week ago

This seems like it should be easy to mock, if not reproduce. I'd expect the problem is around ping() raising an error. I'm not sure that this is a bug though. If you can't establish a connection during setup, I think the error state is the right place to be. In this scenario, it failed fast to let you see that two connectors were sharing a rate limit, which is probably not what you want, right @gcasar ?