elastic / connectors

Source code for all Elastic connectors, developed by the Search team at Elastic, and home of our Python connector development framework
https://www.elastic.co/guide/en/enterprise-search/master/index.html

encoding with 'idna' codec failed (UnicodeError: label empty or too long) #2617

Open artem-shelkovnikov opened 3 weeks ago

artem-shelkovnikov commented 3 weeks ago

Bug Description

Found in telemetry:

encoding with 'idna' codec failed (UnicodeError: label empty or too long)

  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sync_job_runner.py", line 148, in execute
    await self.data_provider.ping()
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sources/salesforce.py", line 1430, in ping
    await self.salesforce_client.ping()
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sources/salesforce.py", line 207, in ping
    await self.session.head(self.base_url)
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/aiohttp/client.py", line 574, in _request
    conn = await self._connector.connect(
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/aiohttp/connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/aiohttp/connector.py", line 911, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/aiohttp/connector.py", line 1173, in _create_direct_connection
    hosts = await asyncio.shield(host_resolved)
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/aiohttp/connector.py", line 884, in _resolve_host
    addrs = await self._resolver.resolve(host, port, family=self._family)
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/aiohttp/resolver.py", line 33, in resolve
    infos = await self._loop.getaddrinfo(
  File "/usr/lib/python3.10/asyncio/base_events.py", line 863, in getaddrinfo
    return await self.run_in_executor(
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):

While the traceback above comes from the Salesforce connector, the same error also occurs in other connectors. Looking online, I found this thread:

The error can be consistently reproduced when the first substring of the url hostname is greater than 64 characters long, as in "0123456789012345678901234567890123456789012345678901234567890123.example.com". This wouldn't be a problem, except that it doesn't seem to separate out credentials from the first substring of the hostname so the entire "[user]:[secret]@XXX" section must be less than 65 characters long. This is problematic for services that use longer API keys and expect their submission over basic auth.
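
The failure described in the quoted thread can be reproduced without any connector or network access, since the traceback ends in Python's built-in `idna` codec. The sketch below is illustrative only: `idna_encodable` is a hypothetical helper, not part of the connectors framework. It feeds the 64-character label from the quoted example to the codec and compares it with a 63-character label, which is the longest a DNS label may be.

```python
def idna_encodable(hostname: str) -> bool:
    """Return True if Python's built-in 'idna' codec accepts the hostname."""
    try:
        hostname.encode("idna")
    except UnicodeError:
        # Raised for empty labels and for labels longer than 63 characters.
        return False
    return True


# The 64-character label from the quoted example.
long_label = "0123456789" * 6 + "0123"
# One character shorter: 63 characters.
ok_label = long_label[:-1]

print(idna_encodable(f"{long_label}.example.com"))  # False
print(idna_encodable(f"{ok_label}.example.com"))    # True
```

Catching `UnicodeError` rather than matching the message text keeps the check independent of the exact wording, which may differ between Python versions.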

To Reproduce

Steps to reproduce the behavior:

  1. Create a MongoDB user and password that are 32 characters long each
  2. Set up a MongoDB connector for this user/password
  3. Run a sync
  4. See error
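
To see why 32-character credentials are enough to trigger the error, assuming (as the quoted thread suggests) that the credentials are not separated from the hostname before resolution, the lengths can be checked with the standard library. The URI, host name, and port below are made up for illustration, and whether a given connector's client library actually passes the unsplit value to the resolver is not confirmed here.

```python
from urllib.parse import urlsplit

# Hypothetical credentials and host, matching the lengths in the steps above.
user = "u" * 32
password = "p" * 32
uri = f"mongodb://{user}:{password}@db.example.com:27017"

parts = urlsplit(uri)

# If the credentials are NOT stripped, everything up to the first dot
# is treated as a single hostname label.
first_label_unsplit = parts.netloc.split(".")[0]
# If the URI is parsed properly, only the real host remains.
first_label_split = parts.hostname.split(".")[0]

print(len(first_label_unsplit))  # 68, over the 63-character label limit
print(first_label_split)         # db
```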

Expected behavior

The connector connects successfully or shows a clear error message.
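
One possible shape for a clearer error message is a hostname check that runs before the first request. This is a minimal sketch, not the framework's implementation: `check_hostname` and `MAX_LABEL_LENGTH` are hypothetical names, and the length check assumes ASCII hostnames (an internationalized name would need its encoded form checked instead).

```python
from urllib.parse import urlsplit

MAX_LABEL_LENGTH = 63  # DNS limit for a single dot-separated label


def check_hostname(url: str) -> None:
    """Raise ValueError with a readable message if the URL's host is invalid.

    urlsplit() separates any "user:password@" part from the host, so
    credentials do not count towards the label length. The messages only
    mention the host, never the credentials.
    """
    host = urlsplit(url).hostname
    if not host:
        raise ValueError("The configured URL does not contain a hostname")
    # A single trailing dot (fully qualified name) is allowed.
    for label in host.rstrip(".").split("."):
        if not label:
            raise ValueError(f"Hostname {host!r} contains an empty label")
        if len(label) > MAX_LABEL_LENGTH:
            raise ValueError(
                f"Hostname {host!r} has a label of {len(label)} characters; "
                f"the maximum is {MAX_LABEL_LENGTH}"
            )
```

A connector could call such a check during configuration validation and report the `ValueError` text to the user instead of the raw codec error.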

Additional context

This problem is common to multiple connectors; it has been seen at least for MongoDB and Salesforce.