oduwsdl / MementoEmbed

A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (mementos).
MIT License

Handle untrusted certificates #93

Open shawnmjones opened 6 years ago

shawnmjones commented 6 years ago

When trying the URI https://www.cs.odu.edu, which is a valid URI-R, MementoEmbed displays the error "MementoEmbed could not reach the server to download https://www.cs.odu.edu."

In response, the application logs this message:

[2018-07-07 00:44:45,922] WARNING in __init__: The server for URI-M https://www.cs.odu.edu could not be reached, details: HTTPSConnectionPool(host='www.cs.odu.edu', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:841)'),))

This happens because the site's certificate authority is not registered with certifi, the package that requests uses for certificate verification. The certifi package gets its certificates from the Mozilla Included CA Certificate List.
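To see which bundle is in play, something like the following could be run in the same environment as MementoEmbed. It is only a rough check: it counts PEM blocks in the file that certifi reports and does not inspect any individual certificate authority.

```python
import certifi

# Path to the CA bundle (a PEM file) that requests falls back to by default.
bundle_path = certifi.where()

with open(bundle_path, encoding="utf-8") as f:
    pem = f.read()

# Each trusted certificate is delimited by a BEGIN CERTIFICATE marker.
ca_count = pem.count("-----BEGIN CERTIFICATE-----")
print(f"{bundle_path} contains {ca_count} CA certificates")
```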

The application has no issues with HTTPS URIs with domains that correspond to trusted certificates (e.g., https://www.google.com, https://www.odu.edu, https://www.washingtonpost.com).

Certificate verification can be disabled by passing verify=False to requests.get. Here is an example from the requests documentation:

>>> requests.get('https://kennethreitz.org', verify=False)
<Response [200]>

This is easy to centralize thanks to changes included from pull request #92.
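As a minimal sketch of what centralizing could look like, assuming all outgoing requests go through one shared requests.Session (make_session and verify_certificates are hypothetical names for illustration, not taken from the MementoEmbed code or from pull request #92):

```python
import requests

def make_session(verify_certificates=True):
    """Build one shared session so the verify setting lives in a single place.

    Hypothetical helper for illustration; not MementoEmbed's actual code.
    """
    session = requests.Session()
    session.verify = verify_certificates
    return session

# Requests sent through this session skip certificate verification unless a
# call passes its own verify value or the environment supplies a CA bundle.
session = make_session(verify_certificates=False)
# e.g. session.get("https://www.cs.odu.edu")
```

With verification turned off, urllib3 emits an InsecureRequestWarning for each request, which at least leaves a trace in the logs.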

At a minimum, the error message displayed to the user should change, and I am already working on an interface update.
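One way the message could be made more specific is to check for requests.exceptions.SSLError before the more general ConnectionError, since SSLError is a subclass of ConnectionError in requests. The sketch below is an assumption about how that might look: describe_failure is a hypothetical helper, and the wording of the certificate message is invented for illustration.

```python
import requests

def describe_failure(exc, uri):
    """Map a requests exception to a user-facing message (hypothetical helper)."""
    # SSLError must be tested first because it inherits from ConnectionError.
    if isinstance(exc, requests.exceptions.SSLError):
        return f"MementoEmbed could not verify the certificate of the server for {uri}."
    if isinstance(exc, requests.exceptions.ConnectionError):
        return f"MementoEmbed could not reach the server to download {uri}."
    return f"MementoEmbed encountered an unexpected error while downloading {uri}."

ssl_message = describe_failure(
    requests.exceptions.SSLError("certificate verify failed"),
    "https://www.cs.odu.edu",
)
print(ssl_message)
```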

The question is, should we ignore all certificate verification issues?

shawnmjones commented 6 years ago

Maybe we make this configurable by the administrator, or, even better, we allow the user to specify it as an option when requesting a surrogate.
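A rough sketch of how the two ideas could combine, assuming the administrator decides whether skipping verification is permitted at all and the user then opts in per request. The function and parameter names are hypothetical, and verification stays on unless both agree:

```python
def resolve_verify(admin_allows_insecure, user_requested_insecure=False):
    """Return the value to pass as `verify` to requests (hypothetical helper).

    Verification is only disabled when the administrator permits it AND the
    user explicitly asks for it; every other combination keeps it enabled.
    """
    if admin_allows_insecure and user_requested_insecure:
        return False
    return True

# The user asks to skip verification, but the administrator forbids it.
print(resolve_verify(admin_allows_insecure=False, user_requested_insecure=True))
```

Defaulting to verification in every other case means a misconfiguration fails closed instead of silently trusting any certificate.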