elastic / connectors

Source code for all Elastic connectors, developed by the Search team at Elastic, and home of our Python connector development framework
https://www.elastic.co/guide/en/enterprise-search/master/index.html

[Github] `Error while checking for inaccessible repositories. Exception: 403` when trying to sync `private` repositories #2636

Open spong opened 2 weeks ago

spong commented 2 weeks ago

Bug Description

I was trying to sync some internal documentation from the https://github.com/elastic/security-team repo, which is an Elastic private repository (not internal). When the repo is specified in the List of repositories field of the connector configuration, the sync fails with the following error:

Stack trace

```console
[FMWK][22:44:49][ERROR] [Connector id: sdyRBZABSQy1BdxtPVqF, index name: github-docs, Sync job id: jeLCCpABSQy1BdxtYKnM] Error while checking for inaccessible repositories. Exception: 403, message='Forbidden', url=URL('https://api.github.com/graphql').
Traceback (most recent call last):
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 1361, in _get_invalid_repos_for_personal_access_token
    async for repo in self.github_client.get_org_repos(
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 926, in get_org_repos
    async for response in self.paginated_api_call(
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 853, in paginated_api_call
    response = await self.graphql(query=query, variables=variables)
  File "/Users/garrettspong/dev/connectors/connectors/utils.py", line 571, in wrapped
    raise e
  File "/Users/garrettspong/dev/connectors/connectors/utils.py", line 568, in wrapped
    return await func(*args, **kwargs)
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 779, in graphql
    return await self._get_client.graphql(
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/gidgethub/abc.py", line 264, in graphql
    status_code, response_headers, response_data = await self._request(
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/gidgethub/aiohttp.py", line 19, in _request
    async with self._session.request(
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/aiohttp/client.py", line 696, in _request
    resp.raise_for_status()
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 1070, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://api.github.com/graphql')
[FMWK][22:44:49][ERROR] [Connector id: sdyRBZABSQy1BdxtPVqF, index name: github-docs, Sync job id: jeLCCpABSQy1BdxtYKnM] 403, message='Forbidden', url=URL('https://api.github.com/graphql')
Traceback (most recent call last):
  File "/Users/garrettspong/dev/connectors/connectors/sync_job_runner.py", line 167, in execute
    await self.data_provider.validate_config()
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 1466, in validate_config
    await self._remote_validation()
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 1429, in _remote_validation
    await self._validate_configured_repos()
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 1456, in _validate_configured_repos
    invalid_repos = await self.get_invalid_repos()
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 1269, in get_invalid_repos
    return await self._get_invalid_repos_for_personal_access_token()
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 1361, in _get_invalid_repos_for_personal_access_token
    async for repo in self.github_client.get_org_repos(
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 926, in get_org_repos
    async for response in self.paginated_api_call(
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 853, in paginated_api_call
    response = await self.graphql(query=query, variables=variables)
  File "/Users/garrettspong/dev/connectors/connectors/utils.py", line 571, in wrapped
    raise e
  File "/Users/garrettspong/dev/connectors/connectors/utils.py", line 568, in wrapped
    return await func(*args, **kwargs)
  File "/Users/garrettspong/dev/connectors/connectors/sources/github.py", line 779, in graphql
    return await self._get_client.graphql(
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/gidgethub/abc.py", line 264, in graphql
    status_code, response_headers, response_data = await self._request(
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/gidgethub/aiohttp.py", line 19, in _request
    async with self._session.request(
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/aiohttp/client.py", line 696, in _request
    resp.raise_for_status()
  File "/Users/garrettspong/dev/connectors/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 1070, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://api.github.com/graphql')
```

To Reproduce

Steps to reproduce the behavior:

  1. Set up the GitHub connector with the following configuration and sync:

Expected behavior

So long as the access token has access to the repo (which it does), the content should be synced.
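One way to check, outside the connector, that the token can see this one repository is to query it directly instead of listing every repository in the organization. The sketch below only builds the GraphQL request body; `split_repo` and `build_repo_access_payload` are hypothetical helper names for illustration and are not part of the connector code base. It assumes the repository is given in `owner/name` form, as in the Advanced Filter in this report.

```python
import json

# GraphQL query for a single repository. `nameWithOwner` and `isPrivate`
# are fields of GitHub's Repository object.
REPO_ACCESS_QUERY = """
query ($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    nameWithOwner
    isPrivate
  }
}
"""


def split_repo(full_name: str) -> tuple[str, str]:
    """Split 'owner/name' into its two parts, rejecting anything else."""
    owner, sep, name = full_name.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/name', got {full_name!r}")
    return owner, name


def build_repo_access_payload(full_name: str) -> str:
    """Return the JSON body for a single-repository GraphQL request."""
    owner, name = split_repo(full_name)
    return json.dumps(
        {
            "query": REPO_ACCESS_QUERY,
            "variables": {"owner": owner, "name": name},
        }
    )


if __name__ == "__main__":
    # Illustrative only: prints the body; it does not call GitHub.
    print(build_repo_access_payload("elastic/security-team"))
```

POSTing that body to https://api.github.com/graphql with the same token in the `Authorization` header should show whether the 403 is specific to the organization-wide listing or also affects the single repository.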

Environment

Running Kibana `main` from source, Elasticsearch via `yarn es snapshot`, and the GitHub connector `main` from source as well.

Additional context

If you configure List of repositories to be `*` and provide the repo filter via an Advanced Filter (below), syncing works without issue.

Advanced Filter
[
  {
    "filter": {
      "pr": "is:pr  label:\"Team:Security Generative AI\""
    },
    "repository": "elastic/security-team"
  },
  {
    "filter": {
      "issue": "is:issue label:\"Team:Security Generative AI\""
    },
    "repository": "elastic/security-team"
  }
]
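
For anyone applying the same workaround to another repository or label, the filter above can be generated rather than hand-edited. This is a minimal sketch: `build_advanced_filter` is a hypothetical helper, not something the connector provides, and it assumes the rule shape is exactly the one shown above (one `pr` rule and one `issue` rule per repository).

```python
import json


def build_advanced_filter(repository: str, label: str) -> list[dict]:
    """Build PR and issue rules for one repository, filtered by one label."""
    # The label is wrapped in double quotes because it may contain spaces.
    label_term = f'label:"{label}"'
    return [
        {"filter": {"pr": f"is:pr {label_term}"}, "repository": repository},
        {"filter": {"issue": f"is:issue {label_term}"}, "repository": repository},
    ]


if __name__ == "__main__":
    rules = build_advanced_filter(
        "elastic/security-team", "Team:Security Generative AI"
    )
    # json.dumps escapes the inner double quotes, as in the filter above.
    print(json.dumps(rules, indent=2))
```

The generated queries use a single space between `is:pr` and `label:`; otherwise the output follows the structure of the filter in this report.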