EGA-archive / ega-download-client

A Python-based EGA download client
Apache License 2.0

Md5 mismatch #214

Open BrendaLee1 opened 8 months ago

BrendaLee1 commented 8 months ago

Hi, I tried to download dataset EGAD00001009109, which is about 1 TB. The download speed is very low (~2M/s), and the following error always occurs:

Traceback (most recent call last):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 710, in _error_catcher
    yield
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 835, in _raw_read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(5921036 bytes read, 98936564 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 940, in stream
    data = self.read(amt=amt, decode_content=decode_content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 911, in read
    data = self._raw_read(amt)
           ^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 813, in _raw_read
    with self._error_catcher():
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 727, in _error_catcher
    raise ProtocolError(f"Connection broken: {e!r}", e) from e
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(5921036 bytes read, 98936564 more expected)', IncompleteRead(5921036 bytes read, 98936564 more expected))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 323, in download_file_retry
    self.download_file(output_file, num_connections, max_slice_size)
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 159, in download_file
    for part_file_name in executor.map(self.download_file_slice_, params):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 189, in download_file_slice_
    return self.download_file_slice(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 224, in download_file_slice
    for chunk in r.iter_content(DOWNLOAD_FILE_MEMORY_BUFFER_SIZE):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(5921036 bytes read, 98936564 more expected)', IncompleteRead(5921036 bytes read, 98936564 more expected))

The download does finish after several rounds of retries, but unfortunately the MD5 checksum of the downloaded file does not match. I also tried older versions of pyega3 (4.0.2 and 5.0.1), but they do not work either.
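For anyone who wants to double-check a downloaded file by hand, independently of pyega3, here is a minimal sketch that computes the MD5 of a large file in chunks and compares it with an expected value. The names `md5_of_file` and `md5_matches` and the 1 MiB chunk size are my own choices for illustration, not part of pyega3.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks so that very large files
    never have to fit in memory. `chunk_size` is an arbitrary choice.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_md5):
    """Compare the file's MD5 with an expected hex digest.

    Surrounding whitespace and letter case in `expected_md5` are ignored.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

Calling `md5_matches("some_file.bam", expected)` with a placeholder file name and the checksum you expect for that file returns `True` only when the two agree, which at least shows whether the file on disk is the problem.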

I wonder if there are alternative ways to download data from EGA, or any suggestions for fixing these problems. Any help would be appreciated.

Jungal10 commented 4 weeks ago

Did you find a solution for this? I am stuck on the same problem here.