Closed tiran closed 4 years ago
The distutils upload command creates a MD5 digest of the file content. This is not compatible with systems with systems that run under a strict security policy that blocks MD5.
Possible fixes are:
hashlib.md5(content)
failsDoes PyPI support other digests, e.g. SHA2-256 digest?
Charis pointed me to https://github.com/pypa/warehouse/issues/681 / https://github.com/pypa/warehouse/pull/891
Does PyPI support other digests, e.g. SHA2-256 digest?
There is a simple and a complicated answer to this.
The simple answer is yes, PyPI supports uploads with any combination of MD5, SHA256, and blake2_256 (blake2b with a 256 digest, no personalization or key). It will also compute all 3 on an upload on it's own and verify that they match any provided hashes and to fill in any missing hashes.
The more complicated answer is the upload API is an old API from long before we started documenting and standardizing them, so when you start talking about non PyPI implementations of that API, what they support is kind of a big who knows.
More to the problem at hand:
We don't rely on this hash for security (We couldn't, it comes in the exact same payload as the artifact itself from the exact same source, someone who can modify the artifact en route can modify the hash too). So the inclusion of MD5 is not a concern.
Removing it *might* break non-PyPI servers that attempted to implement this API and assumed it was a mandatory field (though I do not have any a priori knowledge of this being the case).
Adding additional hashes *might* break non-PyPI servers that assumed what distutils used to send was all it would ever send (this is unlikely though, most web tools ignore unknown form fields).
I looked into what twine is doing here, and it appears it is sending md5, sha256, and blake2_256 hashes all along with every request. However if FIPS mode has disabled MD5 it just skips generating and sending MD5 (but still sends the other two) and it appears it's done this for 2+ years.
It's probably safe to just mimc what twine is doing here, sending all 3 hashes, skip MD5 if it's unavailable.
There is also https://github.com/pypa/warehouse/pull/888
So I would assume it's safe it change the digest to sha256.
Thanks for your elaborate explanation, Donald!
I have implemented your proposal in PR 20260.
New changeset e572c7f6dbe5397153803eab256e4a4ca3384f80 by Christian Heimes in branch 'master': bpo-40698: Improve distutils upload hash digests (GH-20260) https://github.com/python/cpython/commit/e572c7f6dbe5397153803eab256e4a4ca3384f80
New changeset f541a371a5e608517314a106012e0c19739d2d02 by Miss Islington (bot) in branch '3.9': bpo-40698: Improve distutils upload hash digests (GH-20260) https://github.com/python/cpython/commit/f541a371a5e608517314a106012e0c19739d2d02
Thanks Charis and Donald!
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at =
created_at =
labels = ['type-bug', 'library', '3.9', '3.10']
title = 'distutils.command.upload md5_digest'
updated_at =
user = 'https://github.com/tiran'
```
bugs.python.org fields:
```python
activity =
actor = 'christian.heimes'
assignee = 'none'
closed = True
closed_date =
closer = 'christian.heimes'
components = ['Library (Lib)']
creation =
creator = 'christian.heimes'
dependencies = []
files = []
hgrepos = []
issue_num = 40698
keywords = ['patch']
message_count = 8.0
messages = ['369442', '369446', '369447', '369448', '369452', '369457', '369458', '369459']
nosy_count = 6.0
nosy_names = ['gregory.p.smith', 'christian.heimes', 'eric.araujo', 'dstufft', 'cstratak', 'miss-islington']
pr_nums = ['20260', '20261']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue40698'
versions = ['Python 3.9', 'Python 3.10']
```