Duke-GCB / D4S2

Web service to facilitate notification and transfer of projects in DukeDS
MIT License

Some zip files fail to download #218

Closed: johnbradley closed this issue 5 years ago

johnbradley commented 5 years ago

In production I am unable to download a zip file; the download yields only a 59-byte zip file. Error from the d4s2-download container:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/async.py", line 66, in handle
    six.reraise(*sys.exc_info())
  File "/usr/local/lib/python3.6/site-packages/gunicorn/six.py", line 625, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/async.py", line 56, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/ggevent.py", line 152, in handle_request
    super(GeventWorker, self).handle_request(*args)
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/async.py", line 129, in handle_request
    six.reraise(*sys.exc_info())
  File "/usr/local/lib/python3.6/site-packages/gunicorn/six.py", line 625, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/async.py", line 114, in handle_request
    for item in respiter:
  File "/usr/local/lib/python3.6/site-packages/zipstream/__init__.py", line 182, in __iter__
    for data in self.__write(**kwargs):
  File "/usr/local/lib/python3.6/site-packages/zipstream/__init__.py", line 311, in __write
    for buf in iterable:
  File "/app/download_service/zipbuilder.py", line 116, in fetch
    response = requests.get(url, stream=True)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='..dukes3..', port=443): Max retries exceeded with url: ..download_url.. (Caused by SSLError(SSLError(1, '[SSL: DH_KEY_TOO_SMALL] dh key too small (_ssl.c:852)'),))

johnbradley commented 5 years ago

On the dev instance I changed the CipherString setting in /etc/ssl/openssl.cnf to DEFAULT@SECLEVEL=1, and downloads started working.

johnbradley commented 5 years ago

Without the above openssl config change I can reproduce via curl within the dev d4s2-download container:

curl "https://...s3..url"
curl: (35) error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small

On the d4s2 host (outside of docker) the curl command succeeds. Inside the dev d4s2-download container openssl version is OpenSSL 1.1.1c 28 May 2019. Outside the container openssl version is OpenSSL 1.0.2g 1 Mar 2016.
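
Since requests goes through Python's ssl module rather than the curl binary, it may also be worth confirming which OpenSSL build the interpreter itself is linked against. A minimal sketch using only the standard library (the helper name describe_openssl is made up for illustration):

```python
import ssl


def describe_openssl():
    """Return the OpenSSL build that this Python interpreter is linked against."""
    return {
        "version": ssl.OPENSSL_VERSION,            # human-readable version string
        "version_info": ssl.OPENSSL_VERSION_INFO,  # tuple of five integers
    }


if __name__ == "__main__":
    info = describe_openssl()
    print(info["version"])
    print(info["version_info"])
```

Running this inside and outside the container should show whether Python sees the same OpenSSL difference that the openssl command reports.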

johnbradley commented 5 years ago

This Debian bug report suggests removing CipherString from /etc/ssl/openssl.cnf on OpenSSL 1.1.1: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=907788

I think this might need a fix on the S3 server side to use a longer DH key: https://weakdh.org/

dleehr commented 5 years ago

I updated the title because some downloads continue to succeed. @johnbradley determined that files hosted on the S3 server are failing but files hosted from the Swift server are succeeding.

So at this point, we're investigating why downloading from the S3 server is suddenly failing. Was there a server-side change? Did we build a new d4s2 image that included a newer openssl?

dleehr commented 5 years ago

There are also mechanisms to set the default SSL ciphers from Python: https://stackoverflow.com/a/41041028/595085

requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS += ':HIGH:!DH:!aNULL'
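
An alternative that avoids patching urllib3's module-level default would be to build a dedicated ssl.SSLContext and use it only for the S3 downloads. The following is a rough sketch using just the standard library, not what D4S2 does. The helper names and the choice of DEFAULT:!DH are assumptions, and it only helps if the server also offers a non-DHE key exchange.

```python
import ssl
import urllib.request


def cipher_string_without_dhe(base="DEFAULT"):
    """Append an OpenSSL cipher-list rule that removes finite-field DHE key exchange."""
    return base + ":!DH"


def make_context(ciphers):
    """Create a certificate-verifying client context restricted to the given cipher list."""
    context = ssl.create_default_context()
    context.set_ciphers(ciphers)
    return context


def dhe_cipher_names(context):
    """List enabled ciphers whose key exchange is plain ephemeral DH (Kx=DH)."""
    return [c["name"] for c in context.get_ciphers() if "Kx=DH " in c["description"]]


def fetch(url, context):
    """Download a URL using the restricted context (hypothetical helper)."""
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()
```

With these helpers, something like fetch(download_url, make_context(cipher_string_without_dhe())) would attempt the download without offering DHE ciphers, where download_url stands in for a real S3 download URL.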

johnbradley commented 5 years ago

This problem can be reproduced locally with just Docker. To do so, get a download URL for an S3-backed file: https://api.dataservice.duke.edu/apiexplorer#!/files/getApiV1FilesIdUrl

Run a docker command using python:3.6.9 to curl the url:

$ docker run -it python:3.6.9 curl "..s3url..."
curl: (35) error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small

With python:3.6.8 this error does not occur and the file downloads correctly. The D4S2 Docker image uses python:3.6:

https://github.com/Duke-GCB/D4S2/blob/c8b06dfa72f0225a943f2e2c7a6e64bbc50f81a3/Dockerfile#L1

which currently resolves to python:3.6.9.

I think this broke recently because we had been running a container built from a pre-3.6.9 Python Docker image. Deploying a recent release (1.3.0, I think) rebuilt the D4S2 image and pulled the latest 3.6 Python Docker image.
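
One way to catch this kind of drift would be to flag base images that are not pinned to a patch release. The checker below is a hypothetical sketch (the function name and regular expressions are illustrative, not part of D4S2) that reports FROM lines such as python:3.6, whose tag can move to a newer patch version on rebuild. It does not handle registry hosts with ports.

```python
import re

# Matches "FROM image[:tag]" lines; the tag is optional.
FROM_LINE = re.compile(
    r"^\s*FROM\s+(?P<image>[^\s:@]+)(?::(?P<tag>[^\s@]+))?", re.IGNORECASE
)
# A tag counts as pinned here if it starts with major.minor.patch, e.g. 3.6.8 or 3.6.8-slim.
PATCH_PINNED = re.compile(r"^\d+\.\d+\.\d+")


def floating_base_images(dockerfile_text):
    """Return (image, tag) pairs from FROM lines whose tag is not patch-pinned."""
    floating = []
    for line in dockerfile_text.splitlines():
        match = FROM_LINE.match(line)
        if not match:
            continue
        tag = match.group("tag") or "latest"  # Docker defaults to "latest"
        if not PATCH_PINNED.match(tag):
            floating.append((match.group("image"), tag))
    return floating
```

Even a patch-pinned tag can be republished with different system packages, so pinning by image digest is the stricter option if reproducibility matters.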

dleehr commented 5 years ago

Since the cipher negotiation is host-based, you don't even need a real file URL - just a host and port:

$ docker run python:3.6.8 curl https://HOSTNAME/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   165    0   165    0     0   2424      0 --:--:-- --:--:-- --:--:--  2462
<Error><Code>MissingSecurityHeader</Code><Message>Your request was missing a required header.</Message><RequestId>0a0ca910:16641f20d29:14420d:eea</RequestId></Error>
$ docker run python:3.6.9 curl https://HOSTNAME/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
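
The same host-level check could be done from Python, which exercises the ssl module that requests relies on. This is only a sketch under assumptions: the function names are invented here, and the error is recognized by the DH_KEY_TOO_SMALL text seen in the traceback above.

```python
import socket
import ssl


def is_dh_key_too_small(error):
    """True if an SSL error message carries OpenSSL's DH_KEY_TOO_SMALL reason."""
    return "DH_KEY_TOO_SMALL" in str(error)


def probe_handshake(host, port=443, timeout=10):
    """Attempt a TLS handshake; return the negotiated cipher name or a short diagnosis."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return tls.cipher()[0]  # cipher() returns (name, protocol, secret_bits)
    except ssl.SSLError as error:
        if is_dh_key_too_small(error):
            return "handshake failed: dh key too small"
        raise  # other TLS failures are left for the caller to inspect
```

Calling probe_handshake("HOSTNAME") in each image would then report either a cipher name or the DH failure, without needing a valid file URL.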

Let's use 3.6.8 and deploy that for now.