Enhancement Request: Add `usedforsecurity` Argument to Hash Functions to Indicate Security Context Usage

gxx777 commented 6 months ago

Hello,

Thank you for maintaining the Python cryptographic library.

I am a research student with an interest in cryptographic engineering. Recently, we developed a Cryptographic APIs Misuse Detector to identify potential misuses. We have also adapted our rules for the mainstream pycryptodome library. In comparing the use of insecure hash functions between the pycryptodome library and the hashlib standard library, we identified numerous misuses in real-world applications. Unfortunately, unlike the hashlib standard library, the pycryptodome library does not have a keyword-only argument like usedforsecurity to explicitly indicate whether it is used in a security context. I believe it would be beneficial to consider adding this feature. As a reference, in the official library, there is a set keyword-only argument usedforsecurity with the default value True. A false value allows the use of insecure and blocked hashing algorithms in restricted environments.

Reference: hashlib

Changed in version 3.9: All hashlib constructors take a keyword-only argument usedforsecurity with default value True. A false value allows the use of insecure and blocked hashing algorithms in restricted environments. False indicates that the hashing algorithm is not used in a security context, e.g. as a non-cryptographic one-way compression function.

Usage

h = hashlib.new('md5',usedforsecurity=False)
h.update(b"Nobody inspects the spammish repetition")
h.hexdigest()

I am grateful for your dedication to the cryptographic library. If this enhancement is considered, I would be delighted to contribute to the Python cryptographic library.

Thank you for your time and consideration.

Legrandin commented 6 months ago

Thanks for pointing out the existence of this flag in the standard lib - I didn't realize it was added back in 2020 ...

Still, I am not convinced and will not implement the proposal, because I don't agree it is a good idea in the first place.

1) It was not introduced in hashlib for security reasons (e.g., to combat misuse) but to make compliance to FIPS easier, and especially because of OpenSSL FIPS. This is described fully in the old GH issue that proposed the flag. Simply put, it is a way to demonstrate to an auditor that the Python program will not use non-FIPS hashes for security purposes (after having somehow convinced the auditor that the developer set the flag in the right way at each place, which is dubious... - and I wonder how your Detector establishes that? Does it simply trust the flag?).

2) It is simply ugly and a bad pattern. FIPS doesn't cover only hashes, so by the same token that flag needs to be added to every single security algorithm (ciphers, MACs, etc). API calls will become much more verbose, for little value.

gxx777 commented 6 months ago

Thanks for your prompt response. As you mentioned:

The reason for setting the standard library as official due to FIPS, but this indeed serves as a good flag for auditor personnel to check whether it is in a secure context. The PyCryptodome documentation also posts warnings about the historic deprecated hashes algorithm like MD5 and SHA1, advising users against their use. To effectively promote this warning, I believe it can be achieved through this keyword-only argument.
'It is a way to demonstrate to an auditor that the Python program will not use non-FIPS hashes for security purposes.' However, due to the principle of minimum security, the default value of this parameter is True, implying that users are presumed to use it in a security context by default. It doesn't require developers to explicitly set the flag at each place. This flag is introduced just for more standardized use of hash functions.
Regarding the principle behind our detector's implementation, which is aimed at detecting the use of insecure hash functions, it is based on matching the AST call node name pattern of insecure APIs. Additionally, for the hashlib library, we also match this parameter. In our detection results, about 2.5% of developers explicitly set this parameter to False.

Legrandin commented 6 months ago

I don't see it as "a good flag for auditor personnel" though. It is a flag that was added to 1 library, for a different purposes than yours, arguably not on the best technical grounds, and which is inflexible and with limited semantic value. Since you are effectively building a static analysis tool (like Coverity for instance), I would recommend you rather use common practices for such class of tools such as having users adding comments to flag a false positive (like in this case).
You still assume that developers make the right decisions in setting this flag, and that the correct value is static, whereas it could depend on how the code is reached and with which data.
The fact remains that the API is polluted for 100% of developers.

Legrandin / pycryptodome

Enhancement Request: Add `usedforsecurity` Argument to Hash Functions to Indicate Security Context Usage #790

Reference: hashlib

Usage