python / cpython

The Python programming language
https://www.python.org

explore hashlib use of the Apple CryptoKit macOS #91280

Open gpshead opened 2 years ago

gpshead commented 2 years ago
BPO 47124
Nosy @gpshead, @ronaldoussoren, @ned-deily, @corona10

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = None
closed_at = None
created_at =
labels = ['extension-modules', 'OS-mac', 'type-feature']
title = 'explore hashlib use of the Apple CryptoKit macOS'
updated_at =
user = 'https://github.com/gpshead'
```

bugs.python.org fields:

```python
activity =
actor = 'corona10'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Extension Modules', 'macOS']
creation =
creator = 'gregory.p.smith'
dependencies = []
files = []
hgrepos = []
issue_num = 47124
keywords = []
message_count = 4.0
messages = ['416032', '416111', '416131', '416165']
nosy_count = 4.0
nosy_names = ['gregory.p.smith', 'ronaldoussoren', 'ned.deily', 'corona10']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue47124'
versions = []
```

gpshead commented 2 years ago

https://developer.apple.com/documentation/cryptokit/ in macOS 10.15+

This is a common place for platform-specific hardware acceleration to be exposed to the user (especially on SoCs, which often have non-standard hardware, like Apple's... which is presumably why they created this).

What they offer is limited, but when present and running on a recent enough macOS, using their SHA2 and HMAC(SHA2) implementations as well as Insecure.SHA1 is probably better than using OpenSSL's. **Verify this.** It'd also allow those to be fast in a non-OpenSSL build (as if anyone does those).
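As a starting point for the "verify this" step, here is a minimal sketch of a throughput measurement for whatever backend the current `hashlib` build uses (normally OpenSSL). The helper name `hash_throughput_mib_s` and its default sizes are assumptions made for illustration. It only gives the baseline number; a platform API would need its own binding before the two could be compared.

```python
import hashlib
import time


def hash_throughput_mib_s(name: str, size: int = 1 << 20, repeats: int = 20) -> float:
    """Return the best observed one-shot hashing throughput in MiB/s.

    Hypothetical helper: it measures the backend hashlib selected for
    `name` in this build, nothing more.
    """
    data = b"\0" * size
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero interval on very coarse timers.
    best = max(best, 1e-9)
    return (size / (1 << 20)) / best


if __name__ == "__main__":
    for algorithm in ("sha256", "sha512"):
        print(f"{algorithm}: {hash_throughput_mib_s(algorithm):.0f} MiB/s")
```

Taking the best of several runs rather than the mean reduces noise from scheduling and cache warm-up, which matters when the difference between backends may be small.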

I know little about mac building and packaging and how to have something target an older OS and use a 10.15+ API. So if this winds up only being used from aarch64 macOS builds (10.15+ by definition IIRC?) that could also work.

I leave this issue for a macOS Apple API friendly person to take on.

This issue is cousin to the Linux one: https://bugs.python.org/issue47102

ronaldoussoren commented 2 years ago

A "problem" with CryptoKit is that it is a swift-only framework, which makes using those APIs harder from C code (not impossible).

The older Security framework also contains crypto APIs, but seems to have less support for modern algorithms (e.g. no support for Curve25519).

TBH I'm not sure whether it is worthwhile to look into this in CPython, or whether we should rely on OpenSSL for any integration (similar to Christian Heimes' opinion on using the system keystore in the ssl module).

gpshead commented 2 years ago

I only pointed to that API after a brief search without looking at details (Swift? oops!). If there is one available from C that'd also make sense to consider.

The only things I expect, relevant to hashlib, that would be accelerated by OS native APIs on most platforms are SHA2, maybe SHA1, and sometimes HMAC using those.

I'm in no position to judge if there is value in using them, I'm just assuming there might be. The irony is that builds without OpenSSL are rare, so unless the OS native APIs provide tangible benefits it may not matter.

(ex: the Linux APIs may allow for an efficient zero-copy variant of the new hashlib.file_digest() function)
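For context, a small sketch of how `hashlib.file_digest()` is called today. The wrapper name `sha256_of_file` and the chunked fallback for Python versions older than 3.11 are assumptions added for illustration. A zero-copy variant would presumably keep this calling convention and change only what happens underneath.

```python
import hashlib
import io


def sha256_of_file(fileobj, chunk_size: int = 1 << 16) -> str:
    """Hash a binary file object, preferring hashlib.file_digest().

    Hypothetical wrapper: falls back to a manual read loop where
    file_digest() (added in Python 3.11) is not available.
    """
    if hasattr(hashlib, "file_digest"):
        return hashlib.file_digest(fileobj, "sha256").hexdigest()
    h = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    print(sha256_of_file(io.BytesIO(b"example payload")))
```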

ronaldoussoren commented 2 years ago

SecDigestTransformCreate() is probably a relevant API to look into, this seems to be supported from 10.7 until now.

A major disadvantage of this API for us is that it is a CoreFoundation API, and because of that it is problematic in pre-forking scenarios (that is, calls in a child process that is the result of fork-without-exec), because most if not all CoreFoundation types are not safe to use in these scenarios.
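To make the pre-forking scenario concrete, here is a POSIX-only sketch in which a child created by fork-without-exec computes a digest and returns it over a pipe. The helper name `digest_in_forked_child` is an assumption for illustration, and the code uses today's `hashlib`, not CoreFoundation. It shows the usage pattern that any replacement backend would have to keep working.

```python
import hashlib
import os


def digest_in_forked_child(data: bytes) -> str:
    """Compute SHA-256 in a forked child (no exec) and return the hex digest.

    Hypothetical helper; requires os.fork(), so it is POSIX-only.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: hash, write the result to the pipe, then exit without
        # running the parent's cleanup handlers.
        try:
            os.close(read_fd)
            os.write(write_fd, hashlib.sha256(data).hexdigest().encode("ascii"))
        finally:
            os._exit(0)
    # Parent: close the write end so read() sees EOF when the child exits.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe_reader:
        result = pipe_reader.read()
    os.waitpid(pid, 0)
    return result.decode("ascii")


if __name__ == "__main__":
    print(digest_in_forked_child(b"hashed after fork"))
```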

Apple also has an older crypto API, but that has been deprecated for a long time and should not be used.

ronaldoussoren commented 8 months ago

Turns out CommonCrypto is still interesting, the only hashing bits that are deprecated are related to hash functions we shouldn't have anyway (such as MD4).

Interestingly it should be easy enough to experiment with using CommonCrypto, given this comment in CommonCrypto/CommonDigest.h:

/*
 * To use the above digest functions with existing code which uses
 * the corresponding openssl functions, #define the symbol
 * COMMON_DIGEST_FOR_OPENSSL in your client code (BEFORE including
 * this file), and simply link against libSystem (or System.framework)
 * instead of libcrypto.
 *
 * You can *NOT* mix and match functions operating on a given data
 * type from the two implementations; i.e., if you do a CC_MD5_Init()
 * on a CC_MD5_CTX object, do not assume that you can do an openssl-style
 * MD5_Update() on that same context.
 */
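Before touching the C build, a quick way to experiment is to call CommonCrypto's one-shot `CC_SHA256()` through `ctypes` and compare the result with `hashlib`. This is a sketch under stated assumptions: the wrapper name `cc_sha256` is hypothetical, the library path `/usr/lib/libSystem.B.dylib` is assumed to be loadable, and the function returns `None` on anything other than macOS.

```python
import ctypes
import hashlib
import sys

CC_SHA256_DIGEST_LENGTH = 32  # bytes, as defined in CommonDigest.h


def cc_sha256(data: bytes):
    """Return the SHA-256 digest via CommonCrypto, or None if unavailable.

    Hypothetical wrapper for experimentation only.
    """
    if sys.platform != "darwin":
        return None
    try:
        # Assumption: libSystem can be loaded from this path.
        lib = ctypes.CDLL("/usr/lib/libSystem.B.dylib")
    except OSError:
        return None
    if len(data) > 0xFFFFFFFF:
        # CC_LONG is a 32-bit length; larger inputs need the streaming API.
        raise ValueError("input too large for one-shot CC_SHA256")
    # unsigned char *CC_SHA256(const void *data, CC_LONG len, unsigned char *md)
    lib.CC_SHA256.argtypes = [ctypes.c_void_p, ctypes.c_uint32, ctypes.c_void_p]
    lib.CC_SHA256.restype = ctypes.c_void_p
    out = ctypes.create_string_buffer(CC_SHA256_DIGEST_LENGTH)
    lib.CC_SHA256(data, len(data), out)
    return out.raw


if __name__ == "__main__":
    digest = cc_sha256(b"abc")
    if digest is None:
        print("CommonCrypto not available on this platform")
    else:
        print(digest == hashlib.sha256(b"abc").digest())
```

Going through `ctypes` adds call overhead, so this is suitable for checking correctness and for large-buffer timing, not for judging small-message performance.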

We'd still have to ship OpenSSL though, since CommonCrypto does not contain a TLS implementation. There is of course a system TLS library (in the Security framework), but I'm not sure if it is worth the effort to try to implement the ssl interface on top of it. One ssl implementation based on OpenSSL is hard enough to maintain (looking in from the sidelines).