Cloud-CV / evalai-cli

:cloud: :rocket: Official EvalAI Command Line Tool
https://cli.eval.ai
BSD 3-Clause "New" or "Revised" License

socket.timeout error when submitting image using `evalai push` #275

Open RishabhJain2018 opened 4 years ago

RishabhJain2018 commented 4 years ago

When submitting a new Docker image for the first time, I often get the following error:

$ evalai push objectnet_dmayo:v3 --phase objectnet-objectnet-express-293

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 362, in _error_catcher
    yield
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 444, in read
    data = self._fp.read(amt)
  File "/usr/lib/python3.6/http/client.py", line 459, in read
    n = self.readinto(b)
  File "/usr/lib/python3.6/http/client.py", line 493, in readinto
    return self._readinto_chunked(b)
  File "/usr/lib/python3.6/http/client.py", line 588, in _readinto_chunked
    chunk_left = self._get_chunk_left()
  File "/usr/lib/python3.6/http/client.py", line 556, in _get_chunk_left
    chunk_left = self._read_next_chunk_size()
  File "/usr/lib/python3.6/http/client.py", line 516, in _read_next_chunk_size
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/evalai", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python3/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/evalai/submissions.py", line 172, in push
    repository_uri, tag, stream=True, decode=True
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 324, in _stream_helper
    for chunk in json_stream(self._stream_helper(response, False)):
  File "/usr/local/lib/python3.6/dist-packages/docker/utils/json_stream.py", line 66, in split_buffer
    for data in stream_as_text(stream):
  File "/usr/local/lib/python3.6/dist-packages/docker/utils/json_stream.py", line 22, in stream_as_text
    for data in stream:
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 330, in _stream_helper
    data = reader.read(1)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 461, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 367, in _error_catcher
    raise ReadTimeoutError(self._pool, None, 'Read timed out.')
urllib3.exceptions.ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.

Repeating the push immediately afterwards usually succeeds, presumably because most of the Docker layers are already cached on the server by then.
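
Because a second push tends to go through, one possible stopgap is to retry automatically. The following is a minimal sketch only, not part of evalai-cli: `push_with_retry` and the `push_once` callable are hypothetical names, and the caller is assumed to wrap whatever actually performs the push (for example, a subprocess call to `evalai push`).

```python
import time


def push_with_retry(push_once, attempts=3, delay_seconds=5.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call push_once() until it succeeds or the attempts run out.

    push_once is a hypothetical zero-argument callable that performs a
    single push and raises on failure (e.g. a read timeout).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return push_once()
        except retry_on as exc:
            last_error = exc
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                sleep(delay_seconds)
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error
```

Narrowing `retry_on` to the timeout exception seen in the traceback would avoid retrying on unrelated failures such as authentication errors.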

Things already tried that did not work:

  1. Setting DOCKER_CLIENT_TIMEOUT, as suggested in docker/compose#6837.
  2. Setting the following environment variables:
    DOCKER_CLIENT_TIMEOUT=600
    COMPOSE_HTTP_TIMEOUT=600
  3. Setting the following in /etc/docker/daemon.json:
    "max-concurrent-uploads": 1,
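
The traceback goes through the Docker SDK for Python (`docker/api/client.py`), so another angle is to raise the read timeout on the SDK client itself. As far as I know, DOCKER_CLIENT_TIMEOUT and COMPOSE_HTTP_TIMEOUT are read by Docker Compose rather than by the SDK, which may be why setting them had no effect here; treat that as an assumption. The sketch below is untested against this issue: `push_with_timeout` and `client_factory` are hypothetical names, and the `stream=True, decode=True` arguments mirror the call shown in the traceback.

```python
def push_with_timeout(repository, tag, timeout_seconds=600, client_factory=None):
    """Yield push status messages using a client with a longer timeout.

    client_factory is a hypothetical hook for injecting a client; by
    default it is docker.from_env from the Docker SDK for Python.
    """
    if client_factory is None:
        import docker  # imported lazily so the sketch loads without the SDK
        client_factory = docker.from_env
    # from_env accepts a timeout in seconds for API calls.
    client = client_factory(timeout=timeout_seconds)
    # Same streaming arguments as the push call visible in the traceback.
    for status in client.images.push(repository, tag=tag, stream=True, decode=True):
        yield status
```

Usage would look like `for status in push_with_timeout(repository_uri, tag): print(status)`, where `repository_uri` and `tag` are whatever the CLI already passes to the push call.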

RishabhJain2018 commented 4 years ago

@Ayukha Can you please look into this?

Ayukha commented 4 years ago

@RishabhJain2018 sure

RishabhJain2018 commented 4 years ago

One more thing: the Docker images that gave this error were around 7.5 GB in size.