EGA-archive / ega-download-client

A Python-based EGA download client
Apache License 2.0

Download seems to pause when detaching from screen session #218

Open famosab opened 10 months ago

famosab commented 10 months ago

Edit: Now I also run into the following errors:

urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(253739008 bytes read, 820002816 more expected)', IncompleteRead(253739008 bytes read, 820002816 more expected))
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(271826432 bytes read, 801915392 more expected)', IncompleteRead(271826432 bytes read, 801915392 more expected))

which leads to the mentioned restart of the download. So maybe the issue is not within the screen session.

Here is a partial output of the log file

[2024-01-09 10:46:04 +0100] retry attempt 5
[2024-01-09 10:46:04 +0100] Download starting [using 20 connection(s), file size 137943558108 and chunk length 1073741824]...
[2024-01-09 10:46:41 +0100] ('Connection broken: IncompleteRead(271826432 bytes read, 801915392 more expected)', IncompleteRead(271826432 bytes read, 801915392 more expected))
Traceback (most recent call last):
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/urllib3/response.py", line 712, in _error_catcher
    yield
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/urllib3/response.py", line 833, in _raw_read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(271826432 bytes read, 801915392 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/urllib3/response.py", line 934, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/urllib3/response.py", line 905, in read
    data = self._raw_read(amt)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/urllib3/response.py", line 811, in _raw_read
    with self._error_catcher():
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/urllib3/response.py", line 729, in _error_catcher
    raise ProtocolError(f"Connection broken: {e!r}", e) from e
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(271826432 bytes read, 801915392 more expected)', IncompleteRead(271826432 bytes read, 801915392 more expected))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/pyega3/libs/data_file.py", line 339, in _download_whole_file
    self.download_file(output_file, num_connections, max_slice_size)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/pyega3/libs/data_file.py", line 158, in download_file
    for part_file_name in executor.map(self.download_file_slice_, params):
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/pyega3/libs/data_file.py", line 192, in download_file_slice_
    return self.download_file_slice(*args)
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/pyega3/libs/data_file.py", line 227, in download_file_slice
    for chunk in r.iter_content(DOWNLOAD_FILE_MEMORY_BUFFER_SIZE):
  File "/home-link/paifb01/miniconda3/envs/dl/lib/python3.10/site-packages/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(271826432 bytes read, 801915392 more expected)', IncompleteRead(271826432 bytes read, 801915392 more expected))
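
For reference, a broken read like the one in this traceback does not have to cost the whole slice: the transfer can be resumed from the last byte that arrived. The following is only a minimal sketch of that idea and is not pyega3's code. The names `download_slice_with_resume`, `fetch_range`, `max_retries` and `retry_on` are hypothetical, and `fetch_range(first, last)` stands in for whatever issues the HTTP request for an inclusive byte range (with requests, typically a `Range: bytes=first-last` header followed by `iter_content`).

```python
from typing import BinaryIO, Callable, Iterable, Tuple, Type


def download_slice_with_resume(
    fetch_range: Callable[[int, int], Iterable[bytes]],
    out: BinaryIO,
    start: int,
    length: int,
    max_retries: int = 5,
    # requests' ChunkedEncodingError derives from RequestException, which
    # subclasses IOError (OSError), so OSError covers it by default.
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
) -> int:
    """Write bytes [start, start + length) to `out`, resuming after broken reads.

    `fetch_range(first, last)` is a hypothetical callable that yields the
    chunks of the inclusive byte range first..last and may raise mid-stream.
    Returns the number of bytes written.
    """
    received = 0   # bytes of this slice already written to `out`
    failures = 0   # broken or short reads seen so far
    while received < length:
        try:
            # Ask only for the part that is still missing.
            for chunk in fetch_range(start + received, start + length - 1):
                out.write(chunk)
                received += len(chunk)
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise
        else:
            # The stream ended without an error but delivered too little.
            if received < length:
                failures += 1
                if failures > max_retries:
                    raise OSError("stream ended early too many times")
    return received
```

With this shape, a connection that breaks after 271826432 bytes of a 1073741824-byte slice would lead to a new request for the remaining 801915392 bytes only. Whether the EGA server honours such follow-up range requests is an assumption here.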

Download passed time and job time do not match

I am running pyega3 to download one BAM file of around 150 GB. Since I am downloading on our HPC cluster, I run the download from within a screen session to which I assign resources using SLURM commands.

Description of the bug

Every time I detach my screen and re-attach later to check the progress, the elapsed time shown by the download progress bar is shorter than the total running time of the job. The download also progresses more slowly than I expected while I am detached. On my most recent try, the progress bar reset after less than 10 GB had been downloaded.

Used versions

To Reproduce

  1. Start a screen session: screen -S dl
  2. Assign resources to the screen session: srun -N 1 --ntasks-per-node=8 --mem=5G --time=2800
  3. Try to download BAM file EGAF00001074790:
    pyega3 -c 20 -ms 1073741824 -cf ./CREDENTIALS_FILE.json  fetch EGAF00001074790
  4. Detach from screen session
  5. Attach again

Observed behaviour

It seems that the progress bar does not advance while I am detached. Sometimes the progress even resets to 0%.

Expected behaviour

Continuous download when detaching.

Screenshots and error messages

Screenshot of the download progress bar (11 min): [image]

Job list showing how long the job has been running (1 h 44 min): [Screenshot 2024-01-09 at 10 36 56]

Additional context

I added -c 20 -ms 1073741824 to improve download speed from ~300KB/s to ~8MB/s based on Issue #192.
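
As a rough check of what these settings imply, the figures from the log above can be worked through. This is a small sketch with a hypothetical helper name (`plan_slices`); it only does the arithmetic and makes no claim about how pyega3 splits files internally.

```python
def plan_slices(file_size: int, slice_size: int):
    """Hypothetical helper: return (number of slices, size of the last slice)."""
    full, rest = divmod(file_size, slice_size)
    count = full + (1 if rest else 0)
    last = rest if rest else slice_size
    return count, last


FILE_SIZE = 137943558108   # "file size" from the log line
SLICE_SIZE = 1073741824    # "chunk length" from the log line, i.e. -ms (1 GiB)

count, last = plan_slices(FILE_SIZE, SLICE_SIZE)
print(count, last)

# The two numbers in the IncompleteRead message add up to exactly one slice,
# so the connection broke partway through a single 1 GiB slice.
bytes_read, bytes_expected = 271826432, 801915392
print(bytes_read + bytes_expected == SLICE_SIZE)
```

Under these numbers the file is split into 129 slices, the last one 504604636 bytes long, and the failed read had received about a quarter of its slice.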

Yangwanyi1028 commented 10 months ago

I ran into the same error. Have you solved this problem? It seems to be related to the 'requests' package in Python...

famosab commented 10 months ago

No, unfortunately not. I opened a ticket with the EGA helpdesk, but they have not had time to look at it yet.

Yangwanyi1028 commented 10 months ago

I use another download tool (Live Outbox), and it works fine and is really fast! Give it a try! Here’s the link to it: https://ega-archive.org/access/download/files/live-outbox/

famosab commented 10 months ago

That works partially for me. I can connect from my local machine (but not from our HPC cluster), and I can only download whole datasets, not single files (such as the one I mentioned in my original comment). @Yangwanyi1028 Do you have a solution for that?

Yangwanyi1028 commented 10 months ago

I can download a single file in my case, and here's how it works: [image 1705726583182]

famosab commented 10 months ago

Thanks! That worked for me too.

Do you know anything about the files that have .unavailable. in their filenames? Does that mean that I do not have permission to download them? In my case, all .bam files have that in their names, while the .bai files do not.

CsabaHalmagyi commented 4 months ago

@famosab Could you please confirm that this issue is still not solved?

famosab commented 4 months ago

I can check again. I decided to focus on other data first because I did not manage to complete the download. Have you updated the client or changed anything that I should know about before trying again?