epam / badgerdoc

Small issues in 1.8.0 #875

Open templier2 opened 3 months ago

1) The annotation service can't connect to Kafka during startup:

INFO:     Started server process [9]
27-Jun-24 10:24:10 - [INFO] - uvicorn.error - (server.py).serve(84) - Started server process [9]
INFO:     Waiting for application startup.
27-Jun-24 10:24:10 - [INFO] - uvicorn.error - (on.py).startup(45) - Waiting for application startup.
27-Jun-24 10:24:10 - [WARNING] - kafka.conn - (conn.py).dns_lookup(1527) - DNS lookup failed for badgerdoc-kafka:9092, exception was [Errno -3] Temporary failure in name resolution. Is your advertised.listeners (called advertised.host.name before Kafka 9) correct and resolvable?
27-Jun-24 10:24:10 - [ERROR] - kafka.conn - (conn.py)._dns_lookup(315) - DNS lookup failed for badgerdoc-kafka:9092 (AddressFamily.AF_UNSPEC)
27-Jun-24 10:24:10 - [INFO] - kafka.conn - (conn.py).check_version(1205) - Probing node bootstrap-0 broker version
27-Jun-24 10:24:10 - [WARNING] - kafka.conn - (conn.py).dns_lookup(1527) - DNS lookup failed for badgerdoc-kafka:9092, exception was [Errno -3] Temporary failure in name resolution. Is your advertised.listeners (called advertised.host.name before Kafka 9) correct and resolvable?
27-Jun-24 10:24:10 - [ERROR] - kafka.conn - (conn.py)._dns_lookup(315) - DNS lookup failed for badgerdoc-kafka:9092 (AddressFamily.AF_UNSPEC)
27-Jun-24 10:24:10 - [WARNING] - annotation.logger - (main.py)._init_search_annotation_producer(1016) - Error occurred during kafka producer creating: NoBrokersAvailable
INFO:     Application startup complete.
27-Jun-24 10:24:10 - [INFO] - uvicorn.error - (on.py).startup(59) - Application startup complete.

2) The jobs service keeps polling the pipelines service for job progress, and every request fails because the host name badgerdoc-pipelines cannot be resolved:

02-Jul-24 14:35:28 - [ERROR] - jobs.logger - (utils.py).get_job_progress(573) - Failed request url = http://badgerdoc-pipelines:8080/jobs/16/progress, error = Cannot connect to host badgerdoc-pipelines:8080 ssl:default [Name or service not known]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 1203, in _create_direct_connection
    hosts = await self._resolve_host(host, port, traces=traces)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 880, in _resolve_host
    return await asyncio.shield(resolved_host_task)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 917, in _resolve_host_with_throttle
    addrs = await self._resolver.resolve(host, port, family=self._family)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/resolver.py", line 33, in resolve
    infos = await self._loop.getaddrinfo(
  File "uvloop/loop.pyx", line 1528, in getaddrinfo
socket.gaierror: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/jobs/./jobs/utils.py", line 569, in get_job_progress
    _, response = await fetch(
  File "/opt/jobs/./jobs/utils.py", line 591, in fetch
    async with aiohttp.request(
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client.py", line 1246, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client.py", line 581, in _request
    conn = await self._connector.connect(
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 944, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 1209, in _create_direct_connection
    raise ClientConnectorError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host badgerdoc-pipelines:8080 ssl:default [Name or service not known]
INFO:     172.20.0.13:38578 - "POST /jobs/jobs/progress HTTP/1.1" 422 Unprocessable Entity
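
Both logs above show name-resolution errors ("Temporary failure in name resolution" for badgerdoc-kafka:9092 and "Name or service not known" for badgerdoc-pipelines:8080). To confirm whether the host names resolve at all from inside a container, a minimal sketch like the following could be run there. It uses only the Python standard library; the helper name `can_resolve` and the `SERVICES` list are hypothetical and not part of BadgerDoc. The host names and ports are copied from the logs.

```python
import socket

# Host names and ports copied from the log output above.
SERVICES = [("badgerdoc-kafka", 9092), ("badgerdoc-pipelines", 8080)]


def can_resolve(host, port):
    """Return True if the host name resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(host, port)) > 0
    except socket.gaierror:
        # Raised for errors such as [Errno -2] and [Errno -3] seen in the logs.
        return False


if __name__ == "__main__":
    for host, port in SERVICES:
        status = "resolves" if can_resolve(host, port) else "does NOT resolve"
        print(f"{host}:{port} {status}")
```

If a name does not resolve, the likely cause is that the corresponding service is not running or is not attached to the same network as the caller. This check only covers DNS and does not verify that the port accepts connections.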