EGA-archive / ega-download-client

A Python-based EGA download client
Apache License 2.0

[Server Problem] almost impossible to read from server connection #209

Closed TTTPOB closed 11 months ago

TTTPOB commented 11 months ago

Title of the bug

almost impossible to read from server connection (very often)

Description of the bug

I have been downloading a dataset of about 44 TB. The pyega3 downloader keeps retrying and reporting Exception: Slice error: received=5190333, requested=5242880, file=xxxxxx. I first suspected my network environment, but I have confirmed that it is not the cause, so I then suspected a server issue. I checked the auth process in this repo and used curl to test the server connection with the test account and test dataset (ega-test-data@ebi.ac.uk and EGAF00001770106). The server always terminates the connection before sending the full data, or does not send any data in the first place.

The curl output below is an example of an early-terminated connection; more often, the server simply never starts sending data. I used a proxy for this request, but the result is the same with or without it.
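
For readers unfamiliar with the error, here is a minimal sketch of the kind of check that produces a message like the one above: a slice is requested (5242880 bytes is 5 MiB), the number of bytes actually received is compared with the number requested, and the slice is retried on a mismatch. This is not pyega3's actual code; `download_slice`, `fetch`, `max_retries` and `delay` are hypothetical names introduced for illustration, and only the message format is taken from the report.

```python
import time
from typing import Callable


def download_slice(fetch: Callable[[int, int], bytes], start: int, length: int,
                   file_name: str, max_retries: int = 5, delay: float = 0.0) -> bytes:
    """Fetch one slice and retry while the server delivers fewer bytes than requested.

    `fetch(start, length)` is a hypothetical callable that returns whatever
    bytes the server actually sent for the requested range.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        data = fetch(start, length)
        if len(data) == length:
            return data  # complete slice, nothing more to do
        # Short read: remember the error and try again.
        last_error = Exception(
            f"Slice error: received={len(data)}, requested={length}, file={file_name}"
        )
        if attempt < max_retries:
            time.sleep(delay)
    raise last_error
```

With a server that closes connections early on most requests, a loop like this spends nearly all of its time retrying, which matches the behaviour described above.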

curl -H "Authorization: Bearer eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJlZ2EtdGVzdC1kYXRhQGViaS5hYy51ayIsImF6cCI6ImYyMGNkMmQzLTY4MmEtNDU2OC1hNTNlLTQyNjJlZjU0YzhmNCIsImlzcyI6Imh0dHBzOlwvXC9lZ2EuZWJpLmFjLnVrOjg0NDNcL2VnYS1vcGVuaWQtY29ubmVjdC1zZXJ2ZXJcLyIsImV4cCI6MTY5ODA3Njk4NSwiaWF0IjoxNjk4MDczMzg1LCJqdGkiOiI2ZWQxYzhkMy00ZmMwLTRjMTQtOTQzMC0yZmExYjI3NGRkODkifQ.c2sixKVTkjKEx5WBeC1VWbsDLwx25Ag2_zTZC47GAYEkAwq3TV-YLOpM9APTrVf6y6Lb8kGfd3lueXYorQcyv2D4u7VeeACxa3FjmAEyv8CUnGogQKQz2oMWJqzsiEKnKJsqyIUV69xX_a6zJXYDZlTULm-aR-HSRBTjwAsMRIc"  https://ega.ebi.ac.uk:8443/v2/files/EGAF00001770106 -o /dev/null -v
* About to connect() to proxy 192.168.1.134 port 7776 (#0)
*   Trying 192.168.1.134...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to 192.168.1.134 (192.168.1.134) port 7776 (#0)
* Establish HTTP proxy tunnel to ega.ebi.ac.uk:8443
> CONNECT ega.ebi.ac.uk:8443 HTTP/1.1
> Host: ega.ebi.ac.uk:8443
> User-Agent: curl/7.29.0
> Proxy-Connection: Keep-Alive
> Authorization: Bearer eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJlZ2EtdGVzdC1kYXRhQGViaS5hYy51ayIsImF6cCI6ImYyMGNkMmQzLTY4MmEtNDU2OC1hNTNlLTQyNjJlZjU0YzhmNCIsImlzcyI6Imh0dHBzOlwvXC9lZ2EuZWJpLmFjLnVrOjg0NDNcL2VnYS1vcGVuaWQtY29ubmVjdC1zZXJ2ZXJcLyIsImV4cCI6MTY5ODA3Njk4NSwiaWF0IjoxNjk4MDczMzg1LCJqdGkiOiI2ZWQxYzhkMy00ZmMwLTRjMTQtOTQzMC0yZmExYjI3NGRkODkifQ.c2sixKVTkjKEx5WBeC1VWbsDLwx25Ag2_zTZC47GAYEkAwq3TV-YLOpM9APTrVf6y6Lb8kGfd3lueXYorQcyv2D4u7VeeACxa3FjmAEyv8CUnGogQKQz2oMWJqzsiEKnKJsqyIUV69xX_a6zJXYDZlTULm-aR-HSRBTjwAsMRIc
>
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
*       subject: CN=ega.ebi.ac.uk,O=European Bioinformatics Institute,ST=Cambridgeshire,C=GB
*       start date: Jun 21 00:00:00 2023 GMT
*       expire date: Jun 20 23:59:59 2024 GMT
*       common name: ega.ebi.ac.uk
*       issuer: CN=GEANT OV RSA CA 4,O=GEANT Vereniging,C=NL
> GET /v2/files/EGAF00001770106 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: ega.ebi.ac.uk:8443
> Accept: */*
> Authorization: Bearer eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJlZ2EtdGVzdC1kYXRhQGViaS5hYy51ayIsImF6cCI6ImYyMGNkMmQzLTY4MmEtNDU2OC1hNTNlLTQyNjJlZjU0YzhmNCIsImlzcyI6Imh0dHBzOlwvXC9lZ2EuZWJpLmFjLnVrOjg0NDNcL2VnYS1vcGVuaWQtY29ubmVjdC1zZXJ2ZXJcLyIsImV4cCI6MTY5ODA3Njk4NSwiaWF0IjoxNjk4MDczMzg1LCJqdGkiOiI2ZWQxYzhkMy00ZmMwLTRjMTQtOTQzMC0yZmExYjI3NGRkODkifQ.c2sixKVTkjKEx5WBeC1VWbsDLwx25Ag2_zTZC47GAYEkAwq3TV-YLOpM9APTrVf6y6Lb8kGfd3lueXYorQcyv2D4u7VeeACxa3FjmAEyv8CUnGogQKQz2oMWJqzsiEKnKJsqyIUV69xX_a6zJXYDZlTULm-aR-HSRBTjwAsMRIc
>
  0     0    0     0    0     0      0      0 --:--:--  0:00:10 --:--:--     0< HTTP/1.1 200
< Date: Mon, 23 Oct 2023 15:24:35 GMT
< Content-Type: application/octet-stream
< Content-Length: 462139262
< Connection: keep-alive
< Content-Range: bytes 0-462139261/462139262
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 0
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-Frame-Options: DENY
<
  0  440M    0     0    0     0      0      0 --:--:--  0:01:11 --:--:--     0{ [data not shown]
  0  440M    0     0    0     0      0      0 --:--:--  0:01:11 --:--:--     0* transfer closed with 462139262 bytes remaining to read
  0  440M    0     0    0     0      0      0 --:--:--  0:01:11 --:--:--     0
* Closing connection 0
curl: (18) transfer closed with 462139262 bytes remaining to read
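
The log above can be read as follows: the server announced Content-Length: 462139262 (about 440 MiB), and curl gave up with the same number of bytes still remaining, so no body bytes arrived at all after the headers. A small sketch of that arithmetic, assuming a log in curl's verbose format; `bytes_received` is a hypothetical helper, not part of curl or pyega3.

```python
import re


def bytes_received(curl_verbose_log: str) -> int:
    """Return how many body bytes arrived before curl reported error 18."""
    length = re.search(r"Content-Length:\s*(\d+)", curl_verbose_log)
    remaining = re.search(r"transfer closed with (\d+) bytes remaining", curl_verbose_log)
    if length is None or remaining is None:
        raise ValueError("log lacks a Content-Length header or a curl (18) message")
    # Announced size minus what curl was still waiting for.
    return int(length.group(1)) - int(remaining.group(1))


# The two relevant lines from the log above.
log = """< Content-Length: 462139262
curl: (18) transfer closed with 462139262 bytes remaining to read"""
print(bytes_received(log))  # 0
```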

Used versions (please complete the following information)

To Reproduce

  1. run token=$(curl -d "grant_type=password&client_id=f20cd2d3-682a-4568-a53e-4262ef54c8f4&client_secret=AMenuDLjVdVo4BSwi0QD54LL6NeVDEZRzEQUJ7hJOM3g4imDZBHHX0hNfKHPeQIGkskhtCmqAJtt_jm7EKq-rWw&username=ega-test-data@ebi.ac.uk&password=egarocks&scope=openid" -H "Content-Type: application/x-www-form-urlencoded" -k https://ega.ebi.ac.uk:8443/ega-openid-connect-server/token | python3 -c "import sys, json; print(json.load(sys.stdin)[\"access_token\"])")
  2. run curl -H "Authorization: Bearer $token" https://ega.ebi.ac.uk:8443/v2/files/EGAF00001770106 -o /dev/null -v
  3. check the output for a curl (18) error or a transfer that never starts
  4. if the problem does not occur, try again at another time
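
The same steps can be sketched in Python with only the standard library, which makes it easier to compare the received byte count with Content-Length programmatically. The two URLs and the form fields are taken from the curl commands above; the function names are hypothetical, the credentials are passed in as arguments rather than hard-coded, and TLS verification is left enabled (the curl command above uses -k for the token request). Depending on how the connection fails, an early close may show up as a short byte count or as an exception from the socket layer.

```python
import io
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://ega.ebi.ac.uk:8443/ega-openid-connect-server/token"
FILE_URL = "https://ega.ebi.ac.uk:8443/v2/files/EGAF00001770106"


def build_token_request(username, password, client_id, client_secret):
    """Build the POST request that step 1 sends with curl -d."""
    form = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
        "scope": "openid",
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL, data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def extract_token(response_body: bytes) -> str:
    """Pull access_token out of the JSON token response."""
    return json.loads(response_body)["access_token"]


def count_body_bytes(stream, chunk_size=1 << 20) -> int:
    """Read a file-like object to the end and return the number of bytes seen."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)


def reproduce(username, password, client_id, client_secret):
    """Run steps 1 and 2; return (received, expected) byte counts."""
    token_request = build_token_request(username, password, client_id, client_secret)
    with urllib.request.urlopen(token_request) as resp:
        token = extract_token(resp.read())
    file_request = urllib.request.Request(
        FILE_URL, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(file_request) as resp:
        expected = int(resp.headers["Content-Length"])
        received = count_body_bytes(resp)
    return received, expected
```

Calling reproduce(...) with the test account values from step 1 performs the real network requests; a result where received is smaller than expected corresponds to the early termination reported here.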

Observed behaviour

The connection is terminated early, before the full file has been sent.

Expected behaviour

The file downloads successfully.

Screenshots and error messages

See the description above.