Closed antonkratz closed 1 year ago
Hi,
I have the same issue when trying to fetch either a dataset or a single file...
Thank you and best, Rui
I have also had this same issue for weeks. I have tried adjusting the number of connections (-c 1, 5, 10, 20, etc.) and the chunk length (-ms 1073741824; the default is -ms 104857600), as suggested by others, but I continue to have connection issues.
pyega3 -c 5 -ms 1073741824 -cf EGA_credentials_file.json fetch EGAF0000*******
Download starting [using 5 connection(s), file size 7643952250 and chunk length 1073741824]...
0%| | 0.00/7.64G [00:00<?, ?B/s]
[2023-08-16 12:30:39 -0400] Retrying (Retry(total=19, connect=False, read=9, redirect=None, status=10)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /v2/files/EGAF0000*******?destinationFormat=plain
Earlier threads suggested a larger max slice size, but I found that this no longer works: the connection gets interrupted, so we need to start from scratch. So instead I tried a smaller chunk length today, e.g. -ms 50737418, and it seems to be working, though it downloads very slowly, with occasional disruptions, for hours.
I've contacted the help desk, and they said it might take over a month to fix. I asked to be put on the waiting list for Aspera download, but it will probably take some time until it is my turn. I must say this is not the best download experience, especially as I need the data to submit a manuscript pretty soon.
Also having the same issue - don't suppose anyone has found a resolution?
python pyega3 -c 1 -ms 50737418 -cf credential_file.json fetch EGADxxx
[2023-08-17 10:44:07 +0100]
[2023-08-17 10:44:07 +0100] pyEGA3 - EGA python client version 5.0.2 (https://github.com/EGA-archive/ega-download-client)
[2023-08-17 10:44:07 +0100] Parts of this software are derived from pyEGA (https://github.com/blachlylab/pyega) by James Blachly
[2023-08-17 10:44:07 +0100] Python version : 3.11.4
[2023-08-17 10:44:07 +0100] OS version : Linux #1 SMP Tue Jun 20 11:48:01 UTC 2023
[2023-08-17 10:44:07 +0100] Server URL: https://ega.ebi.ac.uk:8443/v2
[2023-08-17 10:44:07 +0100] Session-Id: 2130951956
[2023-08-17 10:44:08 +0100]
[2023-08-17 10:44:08 +0100] Authentication success for user 'x.x@x.x.x.x'
[2023-08-17 10:44:08 +0100] File Id: 'EGAFxxx'(30735778 bytes).
[2023-08-17 10:44:08 +0100] Total space : 26000.00 GiB
[2023-08-17 10:44:08 +0100] Used space : 25450.33 GiB
[2023-08-17 10:44:08 +0100] Free space : 549.67 GiB
[2023-08-17 10:44:08 +0100] Download starting [using 1 connection(s), file size 30735762 and chunk length 50737418]...
0%| | 0.00/30.7M [00:00<?, ?B/s]
EDIT
This actually worked eventually, so it's worth trying a few times and waiting (my dataset was quite small, though)!
[2023-08-17 10:44:08 +0100] Download starting [using 1 connection(s), file size 30735762 and chunk length 50737418]...
100%|██████████| 30.7M/30.7M [03:47<00:00, 135kB/s]
[2023-08-17 10:47:56 +0100] Combining file chunks (this operation can take a long time depending on the file size)
100%|██████████| 30.7M/30.7M [00:00<00:00, 37.7GB/s]
[2023-08-17 10:47:56 +0100] Calculating md5 (this operation can take a long time depending on the file size)
100%|██████████| 30.7M/30.7M [00:00<00:00, 494MB/s]
[2023-08-17 10:47:56 +0100] Verifying file checksum
[2023-08-17 10:47:56 +0100] Saved to : '/well/ckb/users/aey472/EGAF00005858167/Kutanan_Liu_2021_Thai_Lao.tar.gz'(30735762 bytes, md5=6a78f1d316572c3c8f21ec73faa1036f)
[2023-08-17 10:47:56 +0100] Download complete
Hey sahwa, by "eventually" do you mean after several hours, or several days?
I am stuck in my research project. There is no way to move forward because I do not even have access to the data. I signed a non-disclosure agreement etc., but after all that I still have no data.
It took me about an hour or two of just cancelling the command and resubmitting, and it eventually worked. All done within a day, but the files were pretty tiny (~30 MB or so), so I don't know how long it would take for larger files.
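The cancel-and-resubmit routine described above can be automated. The following is a minimal sketch, not part of pyEGA3: `run_until_success` is a hypothetical helper, and it assumes that the client exits with a non-zero status when a download fails, which this thread does not confirm. The `timeout` argument stands in for manually cancelling a run that is stuck at 0%.

```python
import subprocess
import time


def run_until_success(cmd, max_attempts=10, wait_seconds=60, timeout=None):
    """Run cmd repeatedly until it exits with status 0.

    Returns the number of the successful attempt, or None if every
    attempt failed or timed out. Hypothetical helper for illustration.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child process if the timeout expires.
            result = subprocess.run(cmd, timeout=timeout)
            if result.returncode == 0:
                return attempt
        except subprocess.TimeoutExpired:
            # Treat a hung run like a manual cancel and try again.
            pass
        if attempt < max_attempts:
            time.sleep(wait_seconds)
    return None


# Example call, using the command line quoted earlier in this thread:
# run_until_success(
#     ["pyega3", "-c", "1", "-ms", "50737418",
#      "-cf", "credential_file.json", "fetch", "EGADxxx"],
#     timeout=3600,
# )
```

Whether a resubmitted run resumes or starts over depends on the client; the sketch only repeats the command.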
Surprisingly, I have been able to start downloading the datasets in recent days. I found that a smaller -ms actually works. It still gets disconnected every now and then, but at least it restarts at a specific slice. Here is my command:
pyega3 -cf /your_config_file.json -c 20 -ms 100000000 -d fetch EGADxxxxxxxx --output-dir /your_output_directory -M 1000
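To see why a smaller -ms limits the damage from a dropped connection, it helps to count the slices. The sketch below uses the 7643952250-byte file size and the chunk lengths quoted in this thread. `slice_plan` is a hypothetical helper, and the sketch assumes that an interrupted slice has to be fetched again in full, so the chunk length is roughly the worst-case amount of data lost per disconnect.

```python
def slice_plan(file_size: int, chunk_length: int) -> tuple[int, int]:
    """Return (number of slices, size of the last slice in bytes)."""
    if file_size <= 0 or chunk_length <= 0:
        raise ValueError("file_size and chunk_length must be positive")
    # Integer ceiling division: number of chunks needed to cover the file.
    slices = (file_size + chunk_length - 1) // chunk_length
    last_slice = file_size - (slices - 1) * chunk_length
    return slices, last_slice


FILE_SIZE = 7643952250  # file size from the log earlier in this thread

# Chunk lengths mentioned in this thread, largest first.
for chunk_length in (1073741824, 104857600, 100000000, 50737418):
    slices, last_slice = slice_plan(FILE_SIZE, chunk_length)
    print(f"-ms {chunk_length}: {slices} slices, "
          f"up to {chunk_length} bytes repeated per interrupted slice")
```

Under that assumption, a 1 GiB chunk length risks repeating about 1 GiB per disconnect, while 50737418 risks about 50 MB at the cost of many more requests.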
I tried the same script today and it works now!
Thank you so much for the heads up, Guan06. pyega3 fetch EGAD00001001991 started on my machine as well! I will report whether it manages to download the entire thing.
Dear commenters, Thank you for submitting your error logs and reporting the issue. I can confirm that we had a connection problem that made downloads very slow/not possible. This issue has been fixed by the dev team recently. I am closing this issue now but should you encounter any errors in the future, please contact our helpdesk team at helpdesk@ega-archive.org. Regards, Csaba
Hi @CsabaHalmagyi, thank you very much for looking into this. However, I still cannot download successfully. I tried
pyega3 -c 30 -cf /home/kratz/ega_credentials.json fetch EGAD00001006959 --output-dir ~/Lifelines-TEST/EGAD00001006959/
but I got [2023-08-24 22:39:38 +0900] Download process expected md5 value 'ceb13b8005cbbb7ad50fcd6184d3f300' but got '7973a8bbaa7930409f0216c4233a220e'
followed by Python error messages, and then pyEGA3 crashes. I assume this is an unrelated error, but so far I have had no success downloading a very large archive (EGAD00001001991, over 3 terabytes, well over 1500 files), the Lifelines DEEP data. Would Aspera help? I have already written to helpdesk@ega-archive.org.
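When the client reports an md5 mismatch like the one above, the file on disk can be checked independently. This is a minimal sketch using Python's standard hashlib module. It is not pyEGA3's own verification code, and `file_md5` and `matches_expected` are hypothetical names. The file is read in blocks so that very large downloads do not have to fit in memory.

```python
import hashlib


def file_md5(path, block_size=1024 * 1024):
    """Return the hex md5 digest of the file at path, read block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare the file's md5 with an expected hex string, ignoring case."""
    return file_md5(path) == expected_md5.strip().lower()
```

If the recomputed value still differs from the expected one, the file on disk is most likely incomplete or corrupted, which is worth including in a helpdesk report.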
Hi all, last week, when trying to download a dataset, I had the same "download stuck at 0%" issue. Now, whenever I try to download a file (around 30 GB), the download does start and reaches 100%, but the md5 checksum always fails, as it does for @antonkratz. I also sometimes get slice errors during the downloads. I had contacted the helpdesk about the 0% issue and would like an Aspera download link, if possible, to resolve the situation.
I start the client like this:
pyega3 fetch EGAD00001001991
I enter my credentials and am stuck for several hours with:
[2023-08-12 19:54:19 +0900] Download starting [using 1 connection(s), file size 2813259451 and chunk length 104857600]...
How can I proceed from here?