ITISFoundation / osparc-issues

🐼 issue-only repo for the osparc project

Interactive service fails: dy-proxy does not appear to be an IPv4 or IPv6 address #1141

Closed elisabettai closed 1 year ago

elisabettai commented 1 year ago

Long Story Short

A jupyter-math service on osparc.io fails.

Steps to reproduce

  1. Go to osparc.io
  2. Try to open Is1141 study (uuid 633b54ae-4b05-11ee-869a-02420a0bd26e)

Additional context

In the director-v2 logs:

14:46:07,754 | log_source=simcore_service_director_v2.modules.dynamic_sidecar.scheduler._core._observer:observing_single_service(151) | log_uid=None | log_msg=Observation of service_name='dy-sidecar_45d58e45-3926-5a2e-9bd6-3ccc1539d227'  unexpectedly failed [OEC:140533786973984]
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/anyio/_core/_sockets.py", line 189, in connect_tcp
    addr_obj = ip_address(remote_host)
  File "/usr/local/lib/python3.10/ipaddress.py", line 54, in ip_address
    raise ValueError(f'{address!r} does not appear to be an IPv4 or IPv6 address')
ValueError: 'dy-proxy_45d58e45-3926-5a2e-9bd6-3ccc1539d227' does not appear to be an IPv4 or IPv6 address

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 114, in connect_tcp
    stream: anyio.abc.ByteStream = await anyio.connect_tcp(
  File "/home/scu/.venv/lib/python3.10/site-packages/anyio/_core/_sockets.py", line 192, in connect_tcp
    gai_res = await getaddrinfo(
socket.gaierror: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 262, in handle_async_request
    raise exc
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 245, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_async/connection.py", line 92, in handle_async_request
    raise exc
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_async/connection.py", line 69, in handle_async_request
    stream = await self._connect(request)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_async/connection.py", line 117, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_backends/auto.py", line 31, in connect_tcp
    return await self._backend.connect_tcp(
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 112, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/api_client/_base.py", line 70, in request_wrapper
    async for attempt in AsyncRetrying(
  File "/home/scu/.venv/lib/python3.10/site-packages/tenacity/_asyncio.py", line 71, in __anext__
    do = self.iter(retry_state=self._retry_state)
  File "/home/scu/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
  File "/home/scu/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/api_client/_base.py", line 79, in request_wrapper
    r: Response = await request_func(zelf, *args, **kwargs)
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/api_client/_base.py", line 107, in request_wrapper
    response = await request_func(zelf, *args, **kwargs)
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/api_client/_thin.py", line 256, in proxy_config_load
    return await self.client.post(url, json=proxy_configuration)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1848, in post
    return await self.request(
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1530, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1617, in send
    response = await self._send_handling_auth(
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1645, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1682, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1719, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 352, in handle_async_request
    with map_httpcore_exceptions():
  File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/scu/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/scheduler/_core/_observer.py", line 140, in observing_single_service
    await _apply_observation_cycle(scheduler, scheduler_data)
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/scheduler/_core/_observer.py", line 69, in _apply_observation_cycle
    await dynamic_scheduler_event.action(app, scheduler_data)
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/scheduler/_core/_events.py", line 423, in action
    await create_user_services(app, scheduler_data)
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/scheduler/_core/_events_user_services.py", line 197, in create_user_services
    await sidecars_client.configure_proxy(
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/api_client/_public.py", line 444, in configure_proxy
    await self._thin_client.proxy_config_load(proxy_endpoint, proxy_configuration)
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_director_v2/modules/dynamic_sidecar/api_client/_base.py", line 84, in request_wrapper
    raise ClientHttpError(e) from e
simcore_service_director_v2.modules.dynamic_sidecar.api_client._errors.ClientHttpError

elisabettai commented 1 year ago

This actually doesn't happen anymore.

GitHK commented 1 year ago

If DNS resolution is not working for a bit, this will fail. Question: did your service end up being shut down in the UI? If so, there is nothing actionable here.

Unfortunately, services sometimes fail while starting, but we try to report these failures. The user can try to run the service again. If the failure persists, then there is an issue. If it's a one-time thing, we cannot do much about it.
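
For illustration, the failure mode can be sketched in isolation. This is not the director-v2 code: `resolve_with_retry`, its parameters, and the injectable `resolver` argument are hypothetical names used only for this example. It mirrors the two steps visible in the traceback (first try to parse the host as an IP literal, then fall back to `getaddrinfo`) and retries a few times, which helps only if the DNS outage is shorter than the retry window.

```python
import socket
import time
from ipaddress import ip_address


def resolve_with_retry(host, attempts=3, delay_s=0.5, resolver=socket.getaddrinfo):
    """Resolve host to a sorted list of addresses, retrying failed DNS lookups.

    Hypothetical helper for illustration; `resolver` is injectable so the
    retry logic can be exercised without a real DNS server.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    try:
        # An IP literal needs no DNS lookup. A service name such as
        # 'dy-proxy_<uuid>' raises ValueError here, as in the first traceback.
        return [str(ip_address(host))]
    except ValueError:
        pass  # Not an IP literal, so fall through to name resolution.

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] holds the address string.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as err:
            # '[Errno -2] Name or service not known' is reported as gaierror.
            last_error = err
            if attempt < attempts:
                time.sleep(delay_s)
    raise last_error
```

If every attempt fails, the last `socket.gaierror` is re-raised, which corresponds to the situation reported in this issue: the lookup kept failing for longer than the retries lasted.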

elisabettai commented 1 year ago

Thanks @GitHK for looking and expanding the traceback!

> Question: did your service end up being shut down in the UI? If so, there is nothing actionable here.

The service was labelled as "Failed". I think in the logger I could see that the UI was up and running, but I'm not sure.

It happened 2-3 times on Friday: I tried restarting the service (with the start button) and closing and reopening the study, but the service was still failing.

I'll close this for now; it seems there's nothing actionable at this point.