kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.61k stars 1.63k forks source link

Test flakiness because of "unsupported operand type(s) for -=: 'Retry' and 'int'" #2556

Closed Bobgy closed 5 years ago

Bobgy commented 5 years ago

I think this is the top 1 flakiness I have been seeing over a long time.

One stackoverflow answer seems to explain the problem well: https://stackoverflow.com/a/46970344. and the mentioned solution: https://stackoverflow.com/questions/37495375/python-pip-install-throws-typeerror-unsupported-operand-types-for-retry/37531821#37531821

@Ark-kun @numerology, I think we should try to fix this.

Example log

Collecting pyyaml>=3.12 (from kubernetes==9.0.0)
  Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
Collecting requests-oauthlib (from kubernetes==9.0.0)
Exception:
Traceback (most recent call last):
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 594, in urlopen
    chunked=chunked)
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 391, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 387, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.5/http/client.py", line 1198, in getresponse
    response.begin()
  File "/usr/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.5/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.5/socket.py", line 576, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.5/ssl.py", line 937, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.5/ssl.py", line 799, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib/python3.5/ssl.py", line 583, in read
    v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 353, in run
    wb.build(autobuilding=True)
  File "/usr/lib/python3/dist-packages/pip/wheel.py", line 749, in build
    self.requirement_set.prepare_files(self.finder)
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 380, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 620, in _prepare_file
    session=self.session, hashes=hashes)
  File "/usr/lib/python3/dist-packages/pip/download.py", line 821, in unpack_url
    hashes=hashes
  File "/usr/lib/python3/dist-packages/pip/download.py", line 659, in unpack_http_url
    hashes)
  File "/usr/lib/python3/dist-packages/pip/download.py", line 853, in _download_http_url
    stream=True,
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/pip/download.py", line 386, in request
    return super(PipSession, self).request(method, url, *args, **kwargs)
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/share/python-wheels/CacheControl-0.11.7-py2.py3-none-any.whl/cachecontrol/adapter.py", line 47, in send
    resp = super(CacheControlAdapter, self).send(request, **kw)
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 643, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/util/retry.py", line 315, in increment
    total -= 1
TypeError: unsupported operand type(s) for -=: 'Retry' and 'int'
The command '/bin/sh -c pip3 install kubernetes==9.0.0' returned a non-zero code: 2

/kind bug

Ark-kun commented 5 years ago

I want to note that this exception is actually caused by network failures. It's just that the handling of those failures has error. SO this does not seem to be a cause of failure, but rather reflection of another failure.

Bobgy commented 5 years ago

Designed handling is to retry the network request ( inferred from the error message), so I believe it can greatly reduce flakiness if we fix it.

Bobgy commented 5 years ago

Fixed /close

Please report and reopen if you see this again

k8s-ci-robot commented 5 years ago

@Bobgy: Closing this issue.

In response to [this](https://github.com/kubeflow/pipelines/issues/2556#issuecomment-550961955): >Fixed >/close > >Please report and reopen if you see this again Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.