chaoss / grimoirelab-perceval

Send Sir Perceval on a quest to retrieve and gather data from software repositories.
http://perceval.readthedocs.io/
GNU General Public License v3.0

IncompleteRead getting data from groups.io #476

canasdiaz closed 5 years ago

canasdiaz commented 5 years ago

I've seen the error below in our Bitergian production instances, where we are analyzing the group cloudfoundry+cf-dev.

The error:

Error feeding ocean from groupsio (https://groups.io/g/cloudfoundry+cf-dev): ('Connection broken: IncompleteRead(49926748 bytes read)', IncompleteRead(49926748 bytes read))

How to reproduce:

valeriocos commented 5 years ago

Thank you @sanacl for reporting this issue. There seems to be a problem with the upstream server: in a nutshell, the server interrupts the read while the data is being fetched remotely. I left a message on the groups.io API group; you can follow the discussion here: https://beta.groups.io/g/api/topic/impossible_to_download/29431590?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,29431590
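
Until the server side is fixed, a client can at least retry an interrupted read. The snippet below is only a minimal sketch of that idea and is not Perceval's code: `fetch_with_retries`, `download`, `max_retries`, `wait` and `retry_on` are hypothetical names chosen for illustration. It uses only the standard library. With `requests`, the exception to retry on would more likely be `urllib3.exceptions.ProtocolError`, as seen in the error message above.

```python
import time
from http.client import IncompleteRead


def fetch_with_retries(download, max_retries=3, wait=0.0,
                       retry_on=(IncompleteRead, ConnectionError)):
    """Call `download()` until it succeeds or the retries are exhausted.

    `download` is any zero-argument callable that performs the transfer
    (hypothetical; in practice it would wrap the HTTP request).
    """
    attempt = 0
    while True:
        try:
            return download()
        except retry_on:
            attempt += 1
            if attempt > max_retries:
                # Give up and let the caller see the original error.
                raise
            # Linear back-off: wait, 2 * wait, 3 * wait, ...
            time.sleep(wait * attempt)
```

Retrying only helps if the interruption is transient. If the server cuts every large transfer at the same point, the final attempt still raises and the error surfaces as before.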

canasdiaz commented 5 years ago

I've got the same error with a different group using a brand new token. Find the traceback attached just in case it helps:

Error feeding ocean from groupsio (https://groups.io/g/onap+onap-discuss): ('Connection broken: IncompleteRead(179594516 bytes read)', IncompleteRead(179594516 bytes read))
Traceback (most recent call last):
  File "/usr/lib/python3.5/http/client.py", line 541, in _get_chunk_left
    chunk_left = self._read_next_chunk_size()
  File "/usr/lib/python3.5/http/client.py", line 508, in _read_next_chunk_size
    return int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.5/http/client.py", line 558, in _readall_chunked
    chunk_left = self._get_chunk_left()
  File "/usr/lib/python3.5/http/client.py", line 543, in _get_chunk_left
    raise IncompleteRead(b'')
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/urllib3/response.py", line 360, in _error_catcher
    yield
  File "/usr/local/lib/python3.5/dist-packages/urllib3/response.py", line 438, in read
    data = self._fp.read()
  File "/usr/lib/python3.5/http/client.py", line 455, in read
    return self._readall_chunked()
  File "/usr/lib/python3.5/http/client.py", line 565, in _readall_chunked
    raise IncompleteRead(b''.join(value))
http.client.IncompleteRead: IncompleteRead(179594516 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk.py", line 207, in feed_backend
    ocean_backend.feed()
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 202, in feed
    self.feed_items(items)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/raw/elastic.py", line 211, in feed_items
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/perceval/backend.py", line 127, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/usr/local/lib/python3.5/dist-packages/perceval/backends/core/groupsio.py", line 104, in fetch_items
    mailing_list.fetch()
  File "/usr/local/lib/python3.5/dist-packages/perceval/backends/core/groupsio.py", line 178, in fetch
    success = self._download_archive(url, payload, filepath)
  File "/usr/local/lib/python3.5/dist-packages/perceval/backends/core/groupsio.py", line 213, in _download_archive
    self._write_archive(r, filepath)
  File "/usr/local/lib/python3.5/dist-packages/perceval/backends/core/groupsio.py", line 227, in _write_archive
    fd.write(r.raw.read())
  File "/usr/local/lib/python3.5/dist-packages/urllib3/response.py", line 459, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/response.py", line 378, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(179594516 bytes read)', IncompleteRead(179594516 bytes read))
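
The traceback shows the whole archive being read in one call (`fd.write(r.raw.read())`), so roughly 180 MB sit in memory when the connection breaks. One possible mitigation, sketched below under assumptions, is to copy the response to disk in chunks and move the file into place only after a complete read. `write_archive_streaming` and its parameters are hypothetical names, not part of Perceval. `raw` can be any object with a `read(size)` method. With `requests`, that would presumably be `r.raw` from a request made with `stream=True`.

```python
import os
import tempfile


def write_archive_streaming(raw, filepath, chunk_size=64 * 1024):
    """Copy the file-like `raw` to `filepath` in chunks; return bytes written.

    Data goes to a temporary file in the same directory first, so an
    interrupted read never leaves a truncated archive at `filepath`.
    """
    dirname = os.path.dirname(os.path.abspath(filepath))
    fd, tmp_path = tempfile.mkstemp(dir=dirname, suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            while True:
                chunk = raw.read(chunk_size)
                if not chunk:
                    break  # end of stream
                tmp.write(chunk)
                written += len(chunk)
        # Only reached when the stream ended without an error.
        os.replace(tmp_path, filepath)
    except BaseException:
        # Drop the partial download, then re-raise the original error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return written
```

This does not prevent the server from cutting the connection. It only bounds memory use and avoids leaving a half-written archive behind when that happens.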

valeriocos commented 5 years ago

@sanacl if you agree we can close this issue, since it was fixed upstream.