charlesdaniels / bitshuffle

BSD 3-Clause "New" or "Revised" License
5 stars 0 forks source link

Use sha256 instead of sha1 #44

Closed jyn514 closed 6 years ago

jyn514 commented 6 years ago

Currently, all hashing is done using hashlib.sha1(data).hexdigest(data). SHA1 is known to be insecure and should be updated. SHA256 should be used instead, preferrably using the SHA3 standard.

charlesdaniels commented 6 years ago

I am dubious of security considerations in the context of BitShuffle, as the hash is transmitted alongside the data. A potential attacker could modify both the data and the hash if they wanted. The hashes are intended to protect against errors while copy/pasting, or introduced during transmission. The checksum field is also not intended for indexing or use as a primary key, so collisions aren't really going to be a practical issue.

That said, I am not opposed to migrating to a different SHA version. SHA3 has a Python implementation, although I would very much prefer to use something in the Python standard library. I believe SHA256 or SHA512 would be better candidates, as hashlib supports these already.

jyn514 commented 6 years ago

Waiting on https://github.com/charlesdaniels/bitshuffle/pull/46 which abstracts hash into a function. Should be easy to address after that.

jyn514 commented 6 years ago

Waiting on https://github.com/charlesdaniels/bitshuffle/issues/50. Fortunately, sha3_256 is supported natively by hashlib.

jyn514 commented 6 years ago

Would like to push this before release goes through, since it breaks backwards compatibility.

jyn514 commented 6 years ago

Done in https://github.com/charlesdaniels/bitshuffle/commit/763c639c6edad4095555b2c77c972b3eda57f436

jyn514 commented 6 years ago

Python 2 does not support sha3_256

jyn514 commented 6 years ago

Finally done in https://github.com/charlesdaniels/bitshuffle/commit/eea3a069c8da4386cfaae843180cea643df9f1d9