pwncollege / dojo

Infrastructure powering the pwn.college dojo
https://pwn.college
BSD 2-Clause "Simplified" License

badly handled docker errors #552

Open zardus opened 2 months ago

zardus commented 2 months ago

Seeing lots of these in the logs every once in a while.

 - - [01/Sep/2024:21:29:23 +0000] "GET /workspace/code/stable-effc6e95b4ad1c5ac5f9083ec06663ba4a2e005c?reconnectionToken=4601c7e4-bf17-4cac-9fc7-e2a27b68cf13&reconnection=true&skipWebSocketFrames=false HTTP/1.1" 200 0 "-" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36"
ERROR [CTFd.plugins.dojo_plugin.api.v1.docker] ERROR: Docker failed for 69686:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/opt/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.9/http/client.py", line 1377, in getresponse
    response.begin()
  File "/usr/local/lib/python3.9/http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.9/http/client.py", line 281, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/opt/venv/lib/python3.9/site-packages/gevent/_socketcommon.py", line 696, in recv_into
    self._wait(self._read_event)
  File "src/gevent/_hub_primitives.py", line 317, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 322, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 313, in gevent._gevent_c_hub_primitives._primitive_wait
  File "src/gevent/_hub_primitives.py", line 314, in gevent._gevent_c_hub_primitives._primitive_wait
129.219.21.54 - - [01/Sep/2024:21:29:23 +0000] "GET /dojos HTTP/1.1" 200 23690 "https://pwn.college/dojo/cse365-f2024/course/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36 Edg/128.0.0.0"
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 55, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_waiter.py", line 154, in gevent._gevent_c_waiter.Waiter.get
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/opt/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 801, in urlopen
    retries = retries.increment(
  File "/opt/venv/lib/python3.9/site-packages/urllib3/util/retry.py", line 552, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/opt/venv/lib/python3.9/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/opt/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/opt/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 469, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/opt/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/CTFd/CTFd/plugins/dojo_plugin/api/v1/docker.py", line 405, in post
    start_challenge(user, dojo_challenge, practice, as_user=as_user)
  File "/opt/CTFd/CTFd/plugins/dojo_plugin/api/v1/docker.py", line 316, in start_challenge
    container = start_container(
  File "/opt/CTFd/CTFd/plugins/dojo_plugin/api/v1/docker.py", line 235, in start_container
    container.start()
  File "/opt/venv/lib/python3.9/site-packages/docker/models/containers.py", line 406, in start
    return self.client.api.start(self.id, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/docker/api/container.py", line 1126, in start
    res = self._post(url)
  File "/opt/venv/lib/python3.9/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/docker/api/client.py", line 233, in _post
    return self.post(url, **self._set_request_timeout(kwargs))
  File "/opt/venv/lib/python3.9/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/venv/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/opt/venv/lib/python3.9/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

zardus commented 2 months ago

This actually happens often, and in groups. Maybe the Docker daemon is hanging on something, and the gunicorn workers then all fail at once?
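
One possible way to handle these more gracefully, sketched under assumptions: wrap the Docker call in a small retry helper that logs a single line per timeout instead of the full chained traceback, and raises one clear error once it gives up. `call_with_retry` and `DockerUnavailable` are hypothetical names, not part of the dojo plugin or the Docker SDK, and whether retrying `container.start()` is safe when the daemon is genuinely hung is an open question.

```python
import logging
import time

logger = logging.getLogger(__name__)


class DockerUnavailable(Exception):
    """Hypothetical error type: the daemon did not answer after all attempts."""


def call_with_retry(operation, retry_on, attempts=3, delay=1.0, sleep=time.sleep):
    """Run operation(); on the given exception types, retry with linear backoff.

    `retry_on` is a tuple of exception classes. For the traceback in this
    issue that would be (requests.exceptions.ReadTimeout,), but the helper
    itself does not depend on requests or docker.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            # One warning line per failed attempt, no nested traceback.
            logger.warning(
                "docker call timed out (attempt %d/%d): %s",
                attempt, attempts, error,
            )
            if attempt == attempts:
                # Keep the original exception as __cause__ for debugging.
                raise DockerUnavailable(
                    f"gave up after {attempts} attempts"
                ) from error
            # Linear backoff: delay, 2 * delay, ...
            sleep(delay * attempt)
```

Assuming the call site looks like the `container.start()` line in the traceback, usage might be `call_with_retry(container.start, retry_on=(requests.exceptions.ReadTimeout,))`, with the API handler catching `DockerUnavailable` and returning a short error to the user. This only tidies up the symptom; it does not explain why the daemon stops answering within the 60 second read timeout.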