Almost impossible to read from the server connection (fails very often)
Description of the bug
I have been downloading a dataset of 44T in size. The pyega3 downloader keeps retrying and reporting Exception: Slice error: received=5190333, requested=5242880, file=xxxxxx. I first suspected a problem in my network environment, but I checked and found none.
I then suspected a server-side issue.
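To make the error message concrete, here is a minimal Python sketch that pulls the byte counts out of the reported Slice error and computes the shortfall. The helper name `parse_slice_error` is hypothetical and not part of pyega3; it only assumes the message format quoted above.

```python
import re


def parse_slice_error(message: str) -> dict:
    """Extract received/requested byte counts from a 'Slice error' message.

    Hypothetical helper for illustration; it is not a pyega3 function.
    """
    match = re.search(r"received=(\d+), requested=(\d+)", message)
    if match is None:
        raise ValueError("not a slice error message")
    received, requested = int(match.group(1)), int(match.group(2))
    return {
        "received": received,
        "requested": requested,
        "missing": requested - received,
        "fraction_received": received / requested,
    }


info = parse_slice_error(
    "Exception: Slice error: received=5190333, requested=5242880, file=xxxxxx"
)
# The requested slice is exactly 5 MiB (5 * 1024 * 1024 bytes).
print(info["missing"])  # 52547 bytes short
```

In this instance roughly 99% of the slice arrived before the transfer stopped, which suggests the connection was cut mid-stream rather than refused outright.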
I reviewed the authentication process in this repository and used curl to test the server connection with the test account and test dataset (ega-test-data@ebi.ac.uk and EGAF00001770106). The server always terminates the connection before sending the full data, or does not send any data in the first place.
Below is example curl output from an early-terminated connection. More often, the server never starts sending data at all.
I used a proxy for this request, but the result is the same with or without it.
curl -H "Authorization: Bearer eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJlZ2EtdGVzdC1kYXRhQGViaS5hYy51ayIsImF6cCI6ImYyMGNkMmQzLTY4MmEtNDU2OC1hNTNlLTQyNjJlZjU0YzhmNCIsImlzcyI6Imh0dHBzOlwvXC9lZ2EuZWJpLmFjLnVrOjg0NDNcL2VnYS1vcGVuaWQtY29ubmVjdC1zZXJ2ZXJcLyIsImV4cCI6MTY5ODA3Njk4NSwiaWF0IjoxNjk4MDczMzg1LCJqdGkiOiI2ZWQxYzhkMy00ZmMwLTRjMTQtOTQzMC0yZmExYjI3NGRkODkifQ.c2sixKVTkjKEx5WBeC1VWbsDLwx25Ag2_zTZC47GAYEkAwq3TV-YLOpM9APTrVf6y6Lb8kGfd3lueXYorQcyv2D4u7VeeACxa3FjmAEyv8CUnGogQKQz2oMWJqzsiEKnKJsqyIUV69xX_a6zJXYDZlTULm-aR-HSRBTjwAsMRIc" https://ega.ebi.ac.uk:8443/v2/files/EGAF00001770106 -o /dev/null -v
* About to connect() to proxy 192.168.1.134 port 7776 (#0)
* Trying 192.168.1.134...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Connected to 192.168.1.134 (192.168.1.134) port 7776 (#0)
* Establish HTTP proxy tunnel to ega.ebi.ac.uk:8443
> CONNECT ega.ebi.ac.uk:8443 HTTP/1.1
> Host: ega.ebi.ac.uk:8443
> User-Agent: curl/7.29.0
> Proxy-Connection: Keep-Alive
> Authorization: Bearer eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJlZ2EtdGVzdC1kYXRhQGViaS5hYy51ayIsImF6cCI6ImYyMGNkMmQzLTY4MmEtNDU2OC1hNTNlLTQyNjJlZjU0YzhmNCIsImlzcyI6Imh0dHBzOlwvXC9lZ2EuZWJpLmFjLnVrOjg0NDNcL2VnYS1vcGVuaWQtY29ubmVjdC1zZXJ2ZXJcLyIsImV4cCI6MTY5ODA3Njk4NSwiaWF0IjoxNjk4MDczMzg1LCJqdGkiOiI2ZWQxYzhkMy00ZmMwLTRjMTQtOTQzMC0yZmExYjI3NGRkODkifQ.c2sixKVTkjKEx5WBeC1VWbsDLwx25Ag2_zTZC47GAYEkAwq3TV-YLOpM9APTrVf6y6Lb8kGfd3lueXYorQcyv2D4u7VeeACxa3FjmAEyv8CUnGogQKQz2oMWJqzsiEKnKJsqyIUV69xX_a6zJXYDZlTULm-aR-HSRBTjwAsMRIc
>
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=ega.ebi.ac.uk,O=European Bioinformatics Institute,ST=Cambridgeshire,C=GB
* start date: Jun 21 00:00:00 2023 GMT
* expire date: Jun 20 23:59:59 2024 GMT
* common name: ega.ebi.ac.uk
* issuer: CN=GEANT OV RSA CA 4,O=GEANT Vereniging,C=NL
> GET /v2/files/EGAF00001770106 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: ega.ebi.ac.uk:8443
> Accept: */*
> Authorization: Bearer eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJlZ2EtdGVzdC1kYXRhQGViaS5hYy51ayIsImF6cCI6ImYyMGNkMmQzLTY4MmEtNDU2OC1hNTNlLTQyNjJlZjU0YzhmNCIsImlzcyI6Imh0dHBzOlwvXC9lZ2EuZWJpLmFjLnVrOjg0NDNcL2VnYS1vcGVuaWQtY29ubmVjdC1zZXJ2ZXJcLyIsImV4cCI6MTY5ODA3Njk4NSwiaWF0IjoxNjk4MDczMzg1LCJqdGkiOiI2ZWQxYzhkMy00ZmMwLTRjMTQtOTQzMC0yZmExYjI3NGRkODkifQ.c2sixKVTkjKEx5WBeC1VWbsDLwx25Ag2_zTZC47GAYEkAwq3TV-YLOpM9APTrVf6y6Lb8kGfd3lueXYorQcyv2D4u7VeeACxa3FjmAEyv8CUnGogQKQz2oMWJqzsiEKnKJsqyIUV69xX_a6zJXYDZlTULm-aR-HSRBTjwAsMRIc
>
0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0< HTTP/1.1 200
< Date: Mon, 23 Oct 2023 15:24:35 GMT
< Content-Type: application/octet-stream
< Content-Length: 462139262
< Connection: keep-alive
< Content-Range: bytes 0-462139261/462139262
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 0
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-Frame-Options: DENY
<
0 440M 0 0 0 0 0 0 --:--:-- 0:01:11 --:--:-- 0{ [data not shown]
0 440M 0 0 0 0 0 0 --:--:-- 0:01:11 --:--:-- 0* transfer closed with 462139262 bytes remaining to read
0 440M 0 0 0 0 0 0 --:--:-- 0:01:11 --:--:-- 0
* Closing connection 0
curl: (18) transfer closed with 462139262 bytes remaining to read
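The output above can be summarized numerically: the server advertised the full file in Content-Length and then closed the connection without delivering any body bytes. The following Python sketch does that arithmetic from the two values in the log. The function name `summarize_truncation` is an assumption made for illustration.

```python
import re


def summarize_truncation(content_length: int, curl_error: str) -> dict:
    """Compare the advertised Content-Length with curl's 'bytes remaining' error."""
    match = re.search(r"transfer closed with (\d+) bytes remaining", curl_error)
    if match is None:
        raise ValueError("not a curl partial-transfer error message")
    remaining = int(match.group(1))
    return {
        "received": content_length - remaining,
        "remaining": remaining,
        "size_mib": content_length / 2**20,
    }


summary = summarize_truncation(
    462139262,  # Content-Length header from the response above
    "curl: (18) transfer closed with 462139262 bytes remaining to read",
)
print(summary["received"])  # 0: no body bytes arrived before the close
```

The file size works out to about 440 MiB, matching the 440M total shown in curl's progress meter, so the headers were consistent and only the body was missing.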
Used versions (please complete the following information)
To Reproduce
token=$(curl -d "grant_type=password&client_id=f20cd2d3-682a-4568-a53e-4262ef54c8f4&client_secret=AMenuDLjVdVo4BSwi0QD54LL6NeVDEZRzEQUJ7hJOM3g4imDZBHHX0hNfKHPeQIGkskhtCmqAJtt_jm7EKq-rWw&username=ega-test-data@ebi.ac.uk&password=egarocks&scope=openid" -H "Content-Type: application/x-www-form-urlencoded" -k https://ega.ebi.ac.uk:8443/ega-openid-connect-server/token | python3 -c "import sys, json; print(json.load(sys.stdin)[\"access_token\"])")
curl -H "Authorization: Bearer $token" https://ega.ebi.ac.uk:8443/v2/files/EGAF00001770106 -o /dev/null -v
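One thing worth ruling out when reproducing this is token expiry during the transfer. The sketch below decodes the payload of a bearer token without verifying its signature (so it is suitable for diagnostics only) and checks whether a given time falls between its `iat` and `exp` claims. The function names are hypothetical. The demo token is built locally; its `iat` and `exp` values are what the token shown in the description appears to decode to, and the timestamp is the Date header from the curl output.

```python
import base64
import json
from datetime import datetime, timezone


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def token_valid_at(token: str, when: datetime) -> bool:
    """Return True if `when` lies within the token's [iat, exp) window."""
    claims = jwt_claims(token)
    return claims["iat"] <= when.timestamp() < claims["exp"]


def _fake_token(claims: dict) -> str:
    """Build an unsigned placeholder token, for this demo only."""
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=")
    return f"header.{body.decode()}.signature"


demo = _fake_token({"iat": 1698073385, "exp": 1698076985})
response_time = datetime(2023, 10, 23, 15, 24, 35, tzinfo=timezone.utc)
print(token_valid_at(demo, response_time))  # True
```

Under these assumptions the token had a one-hour lifetime and was still valid when the connection closed, which points away from authentication as the cause.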
Observed behaviour
The connection is terminated early, before the file has been fully transferred, or no data is sent at all.
Expected behaviour
The file downloads completely.
Screenshots and error messages
See the description above.