python / cpython

The Python programming language
https://www.python.org

add crypto routines to stdlib #53244

Closed e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 closed 11 years ago

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago
BPO 8998
Nosy @malemburg, @loewis, @birkenfeld, @gpshead, @ncoghlan, @pitrou, @vstinner, @giampaolo, @tiran, @mcepl, @merwok, @davidmalcolm, @durban

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = None
closed_at =
created_at =
labels = ['type-feature', 'library']
title = 'add crypto routines to stdlib'
updated_at =
user = 'https://bugs.python.org/debatem1'
```

bugs.python.org fields:

```python
activity =
actor = 'gregory.p.smith'
assignee = 'none'
closed = True
closed_date =
closer = 'gregory.p.smith'
components = ['Library (Lib)']
creation =
creator = 'debatem1'
dependencies = []
files = []
hgrepos = []
issue_num = 8998
keywords = []
message_count = 95.0
messages = ['107813', '107814', '107816', '107818', '107821', '107822', '107823', '107828', '107830', '107834', '107846', '107868', '107872', '107873', '107882', '107885', '108076', '108077', '108078', '108079', '108080', '108081', '108082', '108083', '108084', '108087', '108088', '108091', '108092', '108093', '108094', '108157', '108171', '108224', '108233', '108921', '108923', '108926', '108931', '108935', '115039', '115044', '115142', '116727', '116730', '116731', '116732', '116733', '116734', '116735', '116736', '116737', '116739', '116743', '116756', '116835', '116838', '116842', '116844', '116845', '116850', '116856', '116860', '116869', '116870', '116871', '116879', '116994', '116995', '116997', '117034', '117035', '117039', '117045', '117078', '117086', '117089', '117092', '117094', '117101', '117121', '117127', '118650', '118651', '118655', '118656', '118657', '118658', '118663', '120306', '194208', '194209', '194210', '194211', '194212']
nosy_count = 21.0
nosy_names = ['lemburg', 'loewis', 'georg.brandl', 'gregory.p.smith', 'exarkun', 'ncoghlan', 'pitrou', 'vstinner', 'giampaolo.rodola', 'christian.heimes', 'lorph', 'heikki', 'mcepl', 'eric.araujo', 'debatem1', 'dmalcolm', 'daniel.urban', 'mcrute', 'jsamuel', 'devin', 'madison.may']
pr_nums = []
priority = 'normal'
resolution = 'later'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue8998'
versions = ['Python 3.4']
```

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

Python's hashlib and ssl modules currently leverage OpenSSL to provide developers with access to cryptographic hash and TLS routines, but encryption/decryption and signature/verification support are still missing. I propose the addition of an easy-to-use crypto module modeled after Evpy[0] to remedy this.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

apologies, forgot the link:

[0] http://gitorious.org/evpy

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 14 years ago

Assuming you are willing to contribute evpy (and have the rights to do so, i.e. all of the code is truly yours): what's the user acceptance of the code?

In particular, what do authors of competing OpenSSL wrappers (like M2Crypto) or other Python crypto packages (like pycrypto) think about this idea?

pitrou commented 14 years ago

> and have the rights to do so, i.e. all of the code is truly yours

Is it really required, or is a non-copyleft liberal license (MIT-like or BSD-like) enough?

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 14 years ago

>> and have the rights to do so, i.e. all of the code is truly yours
>
> Is it really required, or is a non-copyleft liberal license (MIT-like or BSD-like) enough?

The contributor would have to sign a contributor agreement, giving the PSF the right to relicense under the PSF license (or anything they please to relicense under). If the contributor only has a BSD license (from his contributors), he has no right to contribute the code under the contributor agreement (i.e. he, himself, wouldn't have the right to relicense).

pitrou commented 14 years ago

> The contributor would have to sign a contributor agreement, giving the PSF the right to relicense under the PSF license (or anything they please to relicense under). If the contributor only has a BSD license (from his contributors), he has no right to contribute the code under the contributor agreement (i.e. he, himself, wouldn't have the right to relicense).

I always forget about that :/

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Mon, Jun 14, 2010 at 3:09 PM, Martin v. Löwis <report@bugs.python.org> wrote:

> Martin v. Löwis <martin@v.loewis.de> added the comment:
>
> Assuming you are willing to contribute evpy (and have the rights to do so, i.e. all of the code is truly yours): what's the user acceptance of the code?

I'd be willing to, but I see more utility in contributing specific elements of its functionality to the stdlib. Obviously the code is mine, and I can relicense as needed.

As for your second question, I don't believe it sees much in the way of use.

> In particular, what do authors of competing OpenSSL wrappers (like M2Crypto) or other Python crypto packages (like pycrypto) think about this idea?

Evpy and M2Crypto have very different goals. M2Crypto seeks to be a complete wrapper for OpenSSL, which we don't, and also uses SWIG, which disqualifies it from consideration for the stdlib.

I don't know what the pycrypto folks would say about evpy, but I admit to being very wary of that project: it appears to have been constructed in a way which lends itself well to academic exercise rather than practical use by nonexperts, and I have had multiple occasions to correct its dire misuse.

Geremy Condra

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 14 years ago

> Evpy and M2Crypto have very different goals. M2Crypto seeks to be a complete wrapper for OpenSSL, which we don't, and also uses SWIG, which disqualifies it from consideration for the stdlib.
>
> I don't know what the pycrypto folks would say about evpy, but I admit to being very wary of that project: it appears to have been constructed in a way which lends itself well to academic exercise rather than practical use by nonexperts, and I have had multiple occasions to correct its dire misuse.

That isn't really my question; it's the other way 'round: what do *they* (i.e. the respective authors) say about evpy? In the absence of actual user input, support for inclusion of it by these experts would be a valuable indication that this specific library should be included. Likewise, objective resistance may lead to significant changes before inclusion, or to rejection. In the absence of both user support and expert opinions, I'd ask for a PEP.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Mon, Jun 14, 2010 at 3:37 PM, Martin v. Löwis <report@bugs.python.org> wrote:

> Martin v. Löwis <martin@v.loewis.de> added the comment:
>
>> Evpy and M2Crypto have very different goals. M2Crypto seeks to be a complete wrapper for OpenSSL, which we don't, and also uses SWIG, which disqualifies it from consideration for the stdlib.
>>
>> I don't know what the pycrypto folks would say about evpy, but I admit to being very wary of that project- it appears to have been constructed in a way which lends itself well to academic exercise rather than practical use by nonexperts, and have had multiple occasions to correct its dire misuse.
>
> That isn't really my question; it's the other way 'round: what do *they* (i.e. the respective authors) say about evpy? In the absence of actual user input, support for inclusion of it by these experts would be a valuable indication that this specific library should be included. Likewise, objective resistance may lead to significant changes before inclusion, or to rejection. In the absence of both user support and expert opinions, I'd ask for a PEP.

I have no idea, and as I said earlier in the mailing list, I'm willing to contribute the code, make changes as requested, and maintain it- but I have no interest in or skill with the political footwork the process demands. I like to think that if this is as widely desired as it is asked for on python-list that a champion will sooner or later emerge.

Geremy Condra

pitrou commented 14 years ago

On Monday, 14 June 2010 at 22:48 +0000, geremy condra wrote:

> I have no idea, and as I said earlier in the mailing list, I'm willing to contribute the code, make changes as requested, and maintain it- but I have no interest in or skill with the political footwork the process demands. I like to think that if this is as widely desired as it is asked for on python-list that a champion will sooner or later emerge.

For the record, Gregory P Smith (current maintainer of hashlib -- if I'm not mistaken :-)), Jean-Paul Calderone (maintainer of pyOpenSSL) and Heikki Toivonen (maintainer of m2crypto) have been added to the nosy list for this issue. As for the built-in ssl module, I've been doing most of the maintenance work on it lately.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Mon, Jun 14, 2010 at 6:51 PM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> On Monday, 14 June 2010 at 22:48 +0000, geremy condra wrote:
>>
>> I have no idea, and as I said earlier in the mailing list, I'm willing to contribute the code, make changes as requested, and maintain it- but I have no interest in or skill with the political footwork the process demands. I like to think that if this is as widely desired as it is asked for on python-list that a champion will sooner or later emerge.
>
> For the record, Gregory P Smith (current maintainer of hashlib -- if I'm not mistaken :-)), Jean-Paul Calderone (maintainer of pyOpenSSL) and Heikki Toivonen (maintainer of m2crypto) have been added to the nosy list for this issue. As for the built-in ssl module, I've been doing most of the maintenance work on it lately.

Lot of people. If nobody minds I'm going to go ahead and post a link to this on python-crypto, since a lot of the interface emerged out of discussions that group had at pycon.

I'd also urge folks who are interested in this to be vocal about whether they like the API and where they'd like to see changes- I'm open to suggestions and, as noted in the mailing list, am reimplementing in C, so this is a good time to be talking about where you'd like to see things go.

Geremy Condra

pitrou commented 14 years ago

I've taken a quick look at the source tree (there doesn't seem to be any separate docs) and here is my opinion:

  • the evp.py API is too low-level (it's a one-to-one mapping to the OpenSSL C API); we would want at least some kind of object-oriented abstraction around the basic concepts (such as in the hashlib and ssl modules) rather than passing opaque pointers around

  • the other APIs (cipher.py, envelope.py, signature.py) look conversely too high-level, since they focus on specific use cases and make some arbitrary choices for the user (for example, envelope.py imposes AES-192)

By the way, the use of function signature annotations to mirror C APIs as Python APIs through ctypes is nice, perhaps you should upload it as a separate library on PyPI :)

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Tue, Jun 15, 2010 at 9:21 AM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> I've taken a quick look at the source tree (there doesn't seem to be any separate docs) and here is my opinion:
>
> • the evp.py API is too low-level (it's a one-to-one mapping to the OpenSSL C API); we would want at least some kind of object-oriented abstraction around the basic concepts (such as in the hashlib and ssl modules) rather than passing opaque pointers around

evp.py is mostly for internal use (to map the openssl calls into Python) and won't exist in the rewrite- most of the people who would want to use that should really be using M2Crypto or similar.

> • the other APIs (cipher.py, envelope.py, signature.py) look conversely too high-level, since they focus on specific use cases and make some arbitrary choices for the user (for example, envelope.py imposes AES-192)

The goals of the library are simplicity and ease of use. I've frequently found that out of fear of making incorrect choices, people will simply decide not to use crypto at all, or that they make incredibly stupid choices like using RSA without padding. I'd be willing to add in the option to alter those options via keyword arguments if it became a major point of contention, but in general I think it's better for those who "just want to encrypt something" to have a lot of those decisions made for them. The specific decision you're talking about was made because while AES-256 has a bigger number at the end, its key schedule appears weaker in light of recent attacks.

> By the way, the use of function signature annotations to mirror C APIs as Python APIs through ctypes is nice, perhaps you should upload it as a separate library on PyPI :)

I've posted them as recipes on ASPN ([0] and [1]). I used a similar technique and the JNI to mechanically wrap the Android libraries (Java) for access from Python, and it worked pretty well. Looking at the data from pypi, ease-of-use things don't seem to see a lot of use, but if you think I ought to then I could go ahead and do that.

Geremy Condra

[0] http://code.activestate.com/recipes/576731-c-function-decorator/ [1] http://code.activestate.com/recipes/576734-c-struct-decorator/
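
For readers unfamiliar with the annotation technique discussed here, the following is a rough sketch of the idea and not the code from the ASPN recipes. The decorator name `c_function` is hypothetical. The example binds `PyLong_AsLong` from CPython's own C API through `ctypes.pythonapi`, chosen only because that library is always loadable on CPython.

```python
import ctypes


def c_function(library):
    """Bind a C function using the annotations of a Python prototype.

    Hypothetical helper for illustration: parameter annotations become
    ``argtypes`` and the return annotation becomes ``restype`` of the
    symbol in ``library`` that has the same name as the prototype.
    """
    def decorate(prototype):
        hints = dict(prototype.__annotations__)
        # A missing return annotation is treated as a C ``void`` return.
        restype = hints.pop("return", None)
        c_func = getattr(library, prototype.__name__)
        # Annotation order follows the order of the declared parameters.
        c_func.argtypes = list(hints.values())
        c_func.restype = restype
        return c_func
    return decorate


@c_function(ctypes.pythonapi)
def PyLong_AsLong(obj: ctypes.py_object) -> ctypes.c_long:
    """Prototype only; the body is never executed."""


print(PyLong_AsLong(42))
```

The Python function body is discarded; only its name and annotations are used, so the C signature is written once in a form that reads like ordinary Python.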

b4955a8f-c284-45ca-970a-c8b30359ba50 commented 14 years ago

AFAIK, what the stdlib needs is a high-level crypto module, analogous to hashlib
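
For reference, this is the hashlib pattern being held up as the model: pick an algorithm by name, feed data in, read the result out. The snippet uses only existing `hashlib` calls; nothing here is a proposed crypto API.

```python
import hashlib

# Choose an algorithm by name, feed data incrementally, then read the digest.
hasher = hashlib.new("sha256")
hasher.update(b"hello, ")
hasher.update(b"world")
incremental = hasher.hexdigest()

# Named constructors produce the same digest in a single call.
one_shot = hashlib.sha256(b"hello, world").hexdigest()

print(incremental == one_shot)
```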

pitrou commented 14 years ago

On Tuesday, 15 June 2010 at 14:49 +0000, geremy condra wrote:

> The goals of the library are simplicity and ease of use. I've frequently found that out of fear of making incorrect choices, people will simply decide not to use crypto at all, or that they make incredibly stupid choices like using RSA without padding. I'd be willing to add in the option to alter those options via keyword arguments if it became a major point of contention, but in general I think it's better for those who "just want to encrypt something" to have a lot of those decisions made for them. The specific decision you're talking about was made because while AES-256 has a bigger number at the end, its key schedule appears weaker in light of recent attacks.

While it's fine to perhaps detect and warn about insecure use, I don't think the API should be too directive (for inclusion in the stdlib anyway). Most (if not all) stdlib modules don't impose any specific policy but instead provide building blocks for users to address their specific needs. Directive APIs should probably be left to third-party libraries (which can of course build on the primitives provided by the stdlib). Also, some uses of crypto functions can be to interoperate with existing cryptographic protocols, and for that you need a fine-grained control over algorithmic options.

Do note that the docs can be as educating as needed; they can include suggestions, warnings and even small recipes.

As for default argument values, the problem is that we're then stuck with them (for compatibility). It means that if e.g. AES-192 gets compromised, Python will promote an API which by default is insecure and dangerous to use. Again, giving equal access to various ciphers and then providing guidance in the documentation would be a better compromise.
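
One way to read "detect and warn about insecure use" is a non-directive API that accepts any algorithm but emits a warning for discouraged ones. The sketch below is an assumption about how that could look; `select_cipher`, `InsecureCipherWarning` and the contents of `DISCOURAGED_CIPHERS` are all hypothetical and carry no expert endorsement.

```python
import warnings

# Hypothetical list for illustration only; a real module would need this
# reviewed by cryptographers and kept up to date.
DISCOURAGED_CIPHERS = {"des", "rc4"}


class InsecureCipherWarning(UserWarning):
    """Issued when a caller asks for a cipher that is considered weak."""


def select_cipher(name):
    """Return the normalized cipher name, warning if it is discouraged.

    The caller keeps full control: nothing is refused, the choice is only
    flagged so that it shows up in logs and test runs.
    """
    normalized = name.lower()
    if normalized in DISCOURAGED_CIPHERS:
        warnings.warn(
            f"cipher {normalized!r} is considered weak; see the documentation",
            InsecureCipherWarning,
            stacklevel=2,
        )
    return normalized
```

Because the warning is a normal Python warning category, a cautious application could turn it into an error with `warnings.simplefilter("error", InsecureCipherWarning)`, while code that must interoperate with a legacy protocol could silence it.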

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Tue, Jun 15, 2010 at 9:49 AM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> On Tuesday, 15 June 2010 at 14:49 +0000, geremy condra wrote:
>> The goals of the library are simplicity and ease of use. I've frequently found that out of fear of making incorrect choices, people will simply decide not to use crypto at all, or that they make incredibly stupid choices like using RSA without padding. I'd be willing to add in the option to alter those options via keyword arguments if it became a major point of contention, but in general I think it's better for those who "just want to encrypt something" to have a lot of those decisions made for them. The specific decision you're talking about was made because while AES-256 has a bigger number at the end, its key schedule appears weaker in light of recent attacks.
>
> While it's fine to perhaps detect and warn about insecure use, I don't think the API should be too directive (for inclusion in the stdlib anyway). Most (if not all) stdlib modules don't impose any specific policy but instead provide building blocks for users to address their specific needs. Directive APIs should probably be left to third-party libraries (which can of course build on the primitives provided by the stdlib). Also, some uses of crypto functions can be to interoperate with existing cryptographic protocols, and for that you need a fine-grained control over algorithmic options.

I'm not clear on how a crypto library is supposed to detect insecure use short of simply not allowing suspicious things. Maybe you have some ideas there?

As for building-block type systems, as I say, they have their place, particularly where interoperating with existing systems is a concern, and I don't want to come across as though I don't respect projects like M2Crypto- I just think that most developers don't need that level of complexity and aren't prepared to invest the time to learn how to get what they want out of it. That's where something like evpy shines.

> Do note that the docs can be as educating as needed; they can include suggestions, warnings and even small recipes.

I'm reasonably sure that there aren't enough docs in the world to stop people from using OpenSSL to live dangerously. With evpy you could at least keep people from doing something completely stupid a large portion of the time.

> As for default argument values, the problem is that we're then stuck with them (for compatibility). It means that if e.g. AES-192 gets compromised, Python will promote an API which by default is insecure and dangerous to use. Again, giving equal access to various ciphers and then providing guidance in the documentation would be a better compromise.

I would be enormously surprised if a weakness in AES-192 was found that weakened it to the point where it would actually constitute bad advice, assuming that you made all the other right decisions. Having said that, it might be a good idea to put a version switch in that allowed you to specify compatibility modes, just in case.
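
The "version switch" idea can be sketched as a table of frozen defaults keyed by a compatibility version. Everything below is hypothetical: the function name `resolve_options`, the table, and the version 2 entry are placeholders, not recommendations. Only the AES-192 default in version 1 comes from the discussion above.

```python
# Hypothetical defaults table. Each version is frozen once released, so old
# callers keep their behaviour while new callers get the current choices.
DEFAULTS_BY_VERSION = {
    1: {"cipher": "aes-192"},  # the default discussed in this thread
    2: {"cipher": "aes-128"},  # placeholder for some later revision
}
LATEST_VERSION = max(DEFAULTS_BY_VERSION)


def resolve_options(version=None, **overrides):
    """Return the effective options for a compatibility version.

    Explicit keyword overrides win over the versioned defaults, which keeps
    fine-grained control available for interoperability.
    """
    if version is None:
        version = LATEST_VERSION
    try:
        defaults = DEFAULTS_BY_VERSION[version]
    except KeyError:
        raise ValueError(f"unknown defaults version: {version!r}") from None
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise TypeError(f"unknown options: {sorted(unknown)}")
    return {**defaults, **overrides}


print(resolve_options(version=1))
print(resolve_options(cipher="aes-256"))
```

This addresses the compatibility concern only partly: data written under an old version can still be read by pinning that version, but callers who never pin a version silently move to new defaults.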

Geremy Condra

5673b80d-7854-4b39-a467-eb682bce79ae commented 14 years ago

More or less random opinions on things presented before:

 * I prefer having secure defaults over documentation, because, well, people don't read documentation.

 * If not secure defaults, then pointing out in documentation the secure way AND providing examples that always show the secure way of doing things.

 * I can't comment on aes 192 vs 256 as I have not really kept up with that, but it would be good to ask the opinion(s) of the real experts in this field before choosing the defaults/recommending them. Of course, if you can point to an article where the experts already voice their (recent) recommendations, fine.

 * When I have thought about Python crypto in the stdlib, I've considered modeling it after hashlib, so you would get cipher = cryptolib.AES(bits=192, ...) etc. (Caveat: haven't thought it through.)

 * I'd prefer if the crypto API didn't become OpenSSL specific (like the SSL one is), which would theoretically allow switching in other crypto provider(s).

 * The library should make it easy to do the most common operations with as few steps as practically possible.

 * It would be nice if the library could provide the means to tweak lower level things if you needed to. Unfortunately this has a tendency to get messy quick, because crypto stuff tends to have lots of options to tweak.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Thu, Jun 17, 2010 at 8:01 PM, Heikki Toivonen <report@bugs.python.org> wrote:

> Heikki Toivonen <hjtoi-bugzilla@comcast.net> added the comment:
>
> More or less random opinions on things presented before:
>
> * I prefer having secure defaults over documentation, because, well, people don't read documentation.

Wholeheartedly agree.

> * If not secure defaults, then pointing out in documentation the secure way AND providing examples that always show the secure way of doing things.

Not as big a fan, honestly. Most domain-specific projects can count on those reading the documentation to have a good idea of what it is that they actually want to do; in crypto this does not seem to be the case very often, and that's a tricky problem to fix in the scope of a recipe or piece of documentation.

> * I can't comment on aes 192 vs 256 as I have not really kept up with that, but it would be good to ask the opinion(s) of the real experts in this field before choosing the defaults/recommending them. Of course, if you can point to an article where the experts already voice their (recent) recommendations, fine.

http://eprint.iacr.org/2009/317.pdf http://eprint.iacr.org/2009/374.pdf http://eprint.iacr.org/2009/241.pdf

Bruce Schneier's take: http://www.schneier.com/blog/archives/2009/07/another_new_aes.html

The only cryptosystem/padding/etc choice in evpy I'm uncomfortable with (at the moment ;) ) is the use of ad-hoc padding rather than OAEP, and I only do that because that's what evp does. Of course, if you have any other concerns I'd appreciate hearing about them.

> * When I have thought about Python crypto in the stdlib, I've considered modeling it after hashlib, so you would get cipher = cryptolib.AES(bits=192, ...) etc. (Caveat: haven't thought it through.)

I'm not opposed to this, but I suspect that focusing on what the algorithms are for rather than what they are reduces the cognitive load somewhat. Perhaps a two-tier api?

> * I'd prefer if the crypto API didn't become OpenSSL specific (like the SSL one is), which would theoretically allow switching in other crypto provider(s).

I agree in theory, although I'm not sure how important this is likely to be in practice.

> * The library should make it easy to do the most common operations with as few steps as practically possible.
> * It would be nice if the library could provide the means to tweak lower level things if you needed to. Unfortunately this has a tendency to get messy quick, because crypto stuff tends to have lots of options to tweak.

100% agree. If you have any ideas- or if anyone else does- on how best to do this, I'd be very happy to discuss it.

Geremy Condra

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 14 years ago

>> * I'd prefer if the crypto API didn't become OpenSSL specific (like the SSL one is), which would theoretically allow switching in other crypto provider(s).
>
> I agree in theory, although I'm not sure how important this is likely to be in practice.

I always wanted to drop OpenSSL from the Windows binaries, and use MS CryptoAPI instead.

7ad756fc-130f-4966-b479-145798a9f250 commented 14 years ago
> * When I have thought about Python crypto in the stdlib, I've considered modeling it after hashlib, so you would get cipher = cryptolib.AES(bits=192, ...) etc. (Caveat: haven't thought it through.)

I think there is a relevant PEP: PEP-272 -- API for Block Encryption Algorithms v1.0 (http://www.python.org/dev/peps/pep-0272/ ) It describes an API somewhat similar to hashlib.
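
To make the shape of such an interface concrete, here is a sketch loosely following PEP 272 as I read it: a module-level `new(key, mode, IV)` factory, mode constants, and objects with `encrypt`, `decrypt`, `block_size` and `IV`. The "cipher" is a deliberately insecure XOR placeholder so that the example runs with no dependencies; it stands in for AES and must never be used to protect data.

```python
# Names mirror PEP 272's module-level interface; the cipher itself is a toy.
MODE_ECB = 1
MODE_CBC = 2
block_size = 8  # bytes per block for this toy cipher
key_size = 8    # bytes per key for this toy cipher


def _xor(left, right):
    return bytes(a ^ b for a, b in zip(left, right))


class _ToyXorCipher:
    """INSECURE placeholder: XORs each block with the key."""

    def __init__(self, key, mode, IV):
        if len(key) != key_size:
            raise ValueError("key must be exactly key_size bytes")
        if mode not in (MODE_ECB, MODE_CBC):
            raise ValueError("unsupported mode")
        if mode == MODE_CBC and (IV is None or len(IV) != block_size):
            raise ValueError("CBC mode needs an IV of block_size bytes")
        self._key = key
        self.mode = mode
        self.IV = IV
        self.block_size = block_size

    def _blocks(self, data):
        if len(data) % block_size:
            raise ValueError("data length must be a multiple of block_size")
        return [data[i:i + block_size] for i in range(0, len(data), block_size)]

    def encrypt(self, data):
        out = []
        for block in self._blocks(data):
            if self.mode == MODE_CBC:
                block = _xor(block, self.IV)  # chain with the previous ciphertext
            block = _xor(block, self._key)
            if self.mode == MODE_CBC:
                self.IV = block               # feedback for the next block
            out.append(block)
        return b"".join(out)

    def decrypt(self, data):
        out = []
        for block in self._blocks(data):
            plain = _xor(block, self._key)
            if self.mode == MODE_CBC:
                plain = _xor(plain, self.IV)
                self.IV = block
            out.append(plain)
        return b"".join(out)


def new(key, mode=MODE_ECB, IV=None):
    """Create a cipher object, in the style of PEP 272's new()."""
    return _ToyXorCipher(key, mode, IV)


key, iv = b"8bytekey", bytes(8)
ciphertext = new(key, MODE_CBC, iv).encrypt(b"sixteen byte msg")
print(new(key, MODE_CBC, iv).decrypt(ciphertext))
```

Note how much the caller has to decide (key length, mode, IV handling, padding to a block multiple), which is the concern raised elsewhere in this thread about exposing this level to most developers.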

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago
On Fri, Jun 18, 2010 at 2:19 AM, Martin v. Löwis <report@bugs.python.org> wrote:
>
> Martin v. Löwis <martin@v.loewis.de> added the comment:
>
>>>   * I'd prefer if the crypto API didn't become OpenSSL specific (like the SSL one is), which would theoretically allow switching in other crypto provider(s).
>>
>> I agree in theory, although I'm not sure how important this is likely
>> to be in practice.
>
> I always wanted to drop OpenSSL from the Windows binaries, and use MS
> CryptoAPI instead.

My familiarity with the CryptoAPI is limited, but I think doing this for something like evpy would be possible. I also suspect that doing it for anything that exposed much more than evpy does would grow very quickly in complexity where it was possible at all.

Geremy Condra
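
A provider-neutral API along these lines could be sketched as a small backend registry. This is only an illustration of the idea: `register_backend`, `get_backend`, `authenticate` and `HashlibBackend` are hypothetical names, the single implemented backend uses the stdlib's `hmac` and `hashlib`, and no CryptoAPI backend is shown.

```python
import hashlib
import hmac

_BACKENDS = {}


def register_backend(name, backend):
    """Make a provider available under a short name."""
    _BACKENDS[name] = backend


def get_backend(name="hashlib"):
    try:
        return _BACKENDS[name]
    except KeyError:
        raise LookupError(f"no crypto backend named {name!r}") from None


class HashlibBackend:
    """Provider built on the stdlib's hmac and hashlib modules."""

    def mac(self, key, message):
        return hmac.new(key, message, hashlib.sha256).digest()


register_backend("hashlib", HashlibBackend())


def authenticate(key, message, backend="hashlib"):
    """Front-end call: user code never touches the provider directly."""
    return get_backend(backend).mac(key, message)


print(authenticate(b"secret", b"message").hex())
```

The narrower the front-end surface, the easier it is for a second provider to implement it, which matches the point above that this gets hard quickly once much more than evpy's feature set is exposed.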

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Fri, Jun 18, 2010 at 2:39 AM, Daniel Urban <report@bugs.python.org> wrote:

> Daniel Urban <urban.dani+py@gmail.com> added the comment:
>
>> * When I have thought about Python crypto in the stdlib, I've considered modeling it after hashlib, so you would get cipher = cryptolib.AES(bits=192, ...) etc. (Caveat: haven't thought it through.)
>
> I think there is a relevant PEP: PEP-272 -- API for Block Encryption Algorithms v1.0 (http://www.python.org/dev/peps/pep-0272/ ) It describes an API somewhat similar to hashlib.

Again, I'm not entirely opposed to this, but I think it represents a lower-level API than most developers can really be safely trusted to handle.

Geremy Condra

pitrou commented 14 years ago

On Friday, 18 June 2010 at 06:46 +0000, geremy condra wrote:

> geremy condra <debatem1@gmail.com> added the comment:
>
> On Fri, Jun 18, 2010 at 2:39 AM, Daniel Urban <report@bugs.python.org> wrote:
>>
>> Daniel Urban <urban.dani+py@gmail.com> added the comment:
>>
>>> * When I have thought about Python crypto in the stdlib, I've considered modeling it after hashlib, so you would get cipher = cryptolib.AES(bits=192, ...) etc. (Caveat: haven't thought it through.)
>>
>> I think there is a relevant PEP: PEP-272 -- API for Block Encryption Algorithms v1.0 (http://www.python.org/dev/peps/pep-0272/ )
>> It describes an API somewhat similar to hashlib.
>
> Again, I'm not entirely opposed to this, but I think it represents a lower-level API than most developers can really be safely trusted to handle.

If there is a contention or disagreement between different API styles, it may be wise to seek opinions on python-dev or python-ideas.

I'd point out that the "ssl" module itself seems to have evolved from a trivial wrapper API (in the 2.5 docs I can only find a single 3-parameter function, socket.ssl()) to a more comprehensive API in 3.2, because people ultimately need the functionalities. (and yet the ssl API in 3.2 is still much less featureful than M2Crypto or pyOpenSSL are)

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Fri, Jun 18, 2010 at 3:09 AM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> On Friday, 18 June 2010 at 06:46 +0000, geremy condra wrote:
>> geremy condra <debatem1@gmail.com> added the comment:
>>
>> On Fri, Jun 18, 2010 at 2:39 AM, Daniel Urban <report@bugs.python.org> wrote:
>>>
>>> Daniel Urban <urban.dani+py@gmail.com> added the comment:
>>>
>>>> * When I have thought about Python crypto in the stdlib, I've considered modeling it after hashlib, so you would get cipher = cryptolib.AES(bits=192, ...) etc. (Caveat: haven't thought it through.)
>>>
>>> I think there is a relevant PEP: PEP-272 -- API for Block Encryption Algorithms v1.0 (http://www.python.org/dev/peps/pep-0272/ )
>>> It describes an API somewhat similar to hashlib.
>>
>> Again, I'm not entirely opposed to this, but I think it represents a lower-level API than most developers can really be safely trusted to handle.
>
> If there is a contention or disagreement between different API styles, it may be wise to seek opinions on python-dev or python-ideas.

I'm not sure there's a disagreement here except what the top-level API should be. If someone is really determined to use the lower-level API I have no issue with it, and (within the bounds of time and ability) am willing to write the code to support it.

> I'd point out that the "ssl" module itself seems to have evolved from a trivial wrapper API (in the 2.5 docs I can only find a single 3-parameter function, socket.ssl()) to a more comprehensive API in 3.2, because people ultimately need the functionalities. (and yet the ssl API in 3.2 is still much less featureful than M2Crypto or pyOpenSSL are)

I'm not sure I'm understanding what you mean. Are you saying it should start as a comprehensive wrapper because that's what ssl is headed towards or that it should start simply because such functionality will evolve organically as the need arises?

Geremy Condra

pitrou commented 14 years ago

>> I'd point out that the "ssl" module itself seems to have evolved from a trivial wrapper API (in the 2.5 docs I can only find a single 3-parameter function, socket.ssl()) to a more comprehensive API in 3.2, because people ultimately need the functionalities.
>> (and yet the ssl API in 3.2 is still much less featureful than M2Crypto or pyOpenSSL are)
>
> I'm not sure I'm understanding what you mean. Are you saying it should start as a comprehensive wrapper because that's what ssl is headed towards or that it should start simply because such functionality will evolve organically as the need arises?

The former. Evolving organically has quite a few issues, because the original API may be far from ideal to build on, and yet you have to ensure compatibility with that API. ("comprehensive" doesn't have to equate "exhaustive" of course. But any API which tries to simplify things too much might also be a roadblock when it comes to exposing more features)

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Fri, Jun 18, 2010 at 3:28 AM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>>> I'd point out that the "ssl" module itself seems to have evolved from a trivial wrapper API (in the 2.5 docs I can only find a single 3-parameter function, socket.ssl()) to a more comprehensive API in 3.2, because people ultimately need the functionalities.
>>> (and yet the ssl API in 3.2 is still much less featureful than M2Crypto or pyOpenSSL are)
>>
>> I'm not sure I'm understanding what you mean. Are you saying it should start as a comprehensive wrapper because that's what ssl is headed towards or that it should start simply because such functionality will evolve organically as the need arises?
>
> The former. Evolving organically has quite a few issues, because the original API may be far from ideal to build on, and yet you have to ensure compatibility with that API. ("comprehensive" doesn't have to equate "exhaustive" of course. But any API which tries to simplify things too much might also be a roadblock when it comes to exposing more features)

Well, like I say, I'm willing to contribute what time and ability allow. Are you thinking of adding a comprehensive wrapper to the ssl module?

Geremy Condra

pitrou commented 14 years ago

> Well, like I say, I'm willing to contribute what time and ability allow. Are you thinking of adding a comprehensive wrapper to the ssl module?

Hmm, no, I was just providing an existing datapoint to help us decide on a crypto API. AFAICT this issue hasn't much to do with the ssl module, except perhaps for (positive or negative) inspiration ;-) (and except that it will also - most likely - interface with OpenSSL)

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Fri, Jun 18, 2010 at 4:53 AM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> Well, like I say, I'm willing to contribute what time and ability allow. Are you thinking of adding a comprehensive wrapper to the ssl module?
>
> Hmm, no, I was just providing an existing datapoint to help us decide on a crypto API. AFAICT this issue hasn't much to do with the ssl module, except perhaps for (positive or negative) inspiration ;-) (and except that it will also - most likely - interface with OpenSSL)

The question in my mind then is whether anybody willing to contribute time knows enough about the CryptoAPI, or NSS, or what-have-you, to help craft an API that makes the waterfall model look manageable. If not, I would suggest that we focus on defining and building a lower-level interface along the lines of the PEP noted earlier, integrating that with evpy, and getting it in shape to go into the stdlib. At that point, if demand arises for an even lower level API, we already have the wrapping functions for a lot of the calls into OpenSSL or whatever, and we can build on those in the aforementioned evolutionary fashion. If somebody does, then perhaps a four-tiered model makes more sense, with the bottom one being the raw wrappers around the various libs, the second from the bottom being compatibility shims, and the top two matching the other proposal. Having said that, it's not something I could take on alone.

Geremy Condra

pitrou commented 14 years ago

> I would suggest that we focus on defining and building a lower-level interface along the lines of the PEP noted earlier, integrating that with evpy, and getting it in shape to go into the stdlib.

That sounds reasonable to me. (although I would be also content with the lower-level interface alone :-))

> If somebody does, then perhaps a four-tiered model makes more sense, with the bottom one being the raw wrappers around the various libs, the second from the bottom being compatibility shims, and the top two matching the other proposal.

That sounds much too complicated.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Fri, Jun 18, 2010 at 5:37 AM, Antoine Pitrou <report@bugs.python.org> wrote:

Antoine Pitrou <pitrou@free.fr> added the comment:

> I would suggest that we focus on defining and building a lower-level interface along the lines of the PEP noted earlier, integrating that with evpy, and getting it in shape to go into the stdlib.

That sounds reasonable to me.

Great, I'm thinking more-or-less the API proposed in PEP-272- the exception I'm thinking of is that 'strings' should be substituted for 'bytes'- for AES and DES. It gets trickier when talking about public key crypto, though. Perhaps something along the lines of RSA.new(public_key=None, private_key=None,...), with the resulting object supporting encrypt/decrypt/sign/verify operations?

(although I would be also content with the lower-level interface alone :-))

> If somebody does, then perhaps a four-tiered model makes more sense, with the bottom one being the raw wrappers around the various libs, the second from the bottom being compatibility shims, and the top two matching the other proposal.

That sounds much too complicated.

If we're likely to have openssl taken out from under us it could save us a lot of hassle to plan for that up front. If not, then why worry, and ISTM we should go the simpler route.

Geremy Condra

pitrou commented 14 years ago

Great, I'm thinking more-or-less the API proposed in PEP-272- the exception I'm thinking of is that 'strings' should be substituted for 'bytes'- for AES and DES. It gets trickier when talking about public key crypto, though. Perhaps something along the lines of RSA.new(public_key=None, private_key=None,...), with the resulting object supporting encrypt/decrypt/sign/verify operations?

I don't have any opinion right now. I think a concrete proposal should be initiated and we can iterate from that. (that's assuming other people agree on the principle, of course)

If we're likely to have openssl taken out from under us it could save us a lot of hassle to plan for that up front.

It doesn't seem very likely in the middle term. In particular, the ssl module itself is quite tied to OpenSSL's socket wrapping semantics (including error codes and non-blocking behaviour), so OpenSSL will probably still be required for SSL sockets.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Fri, Jun 18, 2010 at 6:05 AM, Antoine Pitrou <report@bugs.python.org> wrote:

Antoine Pitrou <pitrou@free.fr> added the comment:

> Great, I'm thinking more-or-less the API proposed in PEP-272- the exception I'm thinking of is that 'strings' should be substituted for 'bytes'- for AES and DES. It gets trickier when talking about public key crypto, though. Perhaps something along the lines of RSA.new(public_key=None, private_key=None,...), with the resulting object supporting encrypt/decrypt/sign/verify operations?

I don't have any opinion right now. I think a concrete proposal should be initiated and we can iterate from that. (that's assuming other people agree on the principle, of course)

I assume that by "a concrete proposal" you're talking about code? Or API docs? Also, what more needs to be done to ensure that other people agree on the principle?

> If we're likely to have openssl taken out from under us it could save us a lot of hassle to plan for that up front.

It doesn't seem very likely in the middle term. In particular, the ssl module itself is quite tied to OpenSSL's socket wrapping semantics (including error codes and non-blocking behaviour), so OpenSSL will probably still be required for SSL sockets.

I'm fine with doing it the simpler way and adding in support for other systems PRN. Having said that, Martin, if this is high priority for you let me know.

Geremy Condra

pitrou commented 14 years ago

On Saturday, 19 June 2010 at 00:55 +0000, geremy condra wrote:

geremy condra <debatem1@gmail.com> added the comment:

On Fri, Jun 18, 2010 at 6:05 AM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> Great, I'm thinking more-or-less the API proposed in PEP-272- the exception I'm thinking of is that 'strings' should be substituted for 'bytes'- for AES and DES. It gets trickier when talking about public key crypto, though. Perhaps something along the lines of RSA.new(public_key=None, private_key=None,...), with the resulting object supporting encrypt/decrypt/sign/verify operations?
>
> I don't have any opinion right now. I think a concrete proposal should be initiated and we can iterate from that.
> (that's assuming other people agree on the principle, of course)

I assume that by "a concrete proposal" you're talking about code? Or API docs? Also, what more needs to be done to ensure that other people agree on the principle?

I was thinking about a PEP. Of course, you are free to reuse existing PEP content for that :)

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Sat, Jun 19, 2010 at 7:52 AM, Antoine Pitrou <report@bugs.python.org> wrote:

Antoine Pitrou <pitrou@free.fr> added the comment:

On Saturday, 19 June 2010 at 00:55 +0000, geremy condra wrote:
> geremy condra <debatem1@gmail.com> added the comment:
>
> On Fri, Jun 18, 2010 at 6:05 AM, Antoine Pitrou <report@bugs.python.org> wrote:
> >
> > Antoine Pitrou <pitrou@free.fr> added the comment:
> >
> >> Great, I'm thinking more-or-less the API proposed in PEP-272- the exception I'm thinking of is that 'strings' should be substituted for 'bytes'- for AES and DES. It gets trickier when talking about public key crypto, though. Perhaps something along the lines of RSA.new(public_key=None, private_key=None,...), with the resulting object supporting encrypt/decrypt/sign/verify operations?
> >
> > I don't have any opinion right now. I think a concrete proposal should be initiated and we can iterate from that.
> > (that's assuming other people agree on the principle, of course)
>
> I assume that by "a concrete proposal" you're talking about code? Or API docs? Also, what more needs to be done to ensure that other people agree on the principle?

I was thinking about a PEP. Of course, you are free to reuse existing PEP content for that :)

Ok. I've gone ahead and put together kind of a map for what I think the basic structure of the library is going to look like. Let me know what you think, and once we're done with that we can proceed into PEP land.

crypto API
==========

Variables message, key, salt, iv, ciphertext, and signature are of type bytes. Variables public_key and private_key are DER-encoded bytes. Variable bitlength is an integer.

Note that we deviate from the standard in PEP-272 in several ways:

* arguments are generally bytes rather than strings
* ciphers do not accept the 'counter', 'rounds', or 'segment_size' args

Layer 1
-------

Symmetric Ciphers

crypto.cipher.encrypt(message, key) -> (salt, iv, ciphertext)
    depends on:
        crypto.keys.strengthen_password
        crypto.AES.new
        crypto.AES.encrypt
    raises:
        crypto.cipher.EncryptionError

crypto.cipher.decrypt(salt, iv, ciphertext, key) -\> message
    depends on:
        crypto.AES.new
        crypto.AES.decrypt
    raises:
        crypto.cipher.DecryptionError

Envelope Encryption

crypto.envelope.encrypt(message, public_key) -> (iv, aes_key, ciphertext)
    depends on:
        crypto.keys.random_key
        crypto.AES.new
        crypto.AES.encrypt
        crypto.RSA.new
        crypto.RSA.encrypt
    raises:
        crypto.envelope.EncryptionError

crypto.envelope.decrypt(iv, aes_key, ciphertext, private_key) -\> message
    depends on:
        crypto.AES.new
        crypto.AES.decrypt
        crypto.RSA.new
        crypto.RSA.decrypt
    raises:
        crypto.envelope.DecryptionError

Digital Signatures

crypto.signature.sign(message, private_key) -> signature
    depends on:
        hashlib.SHA512.new
        hashlib.SHA512.update
        hashlib.SHA512.digest
        crypto.RSA.new
        crypto.RSA.sign
    raises:
        crypto.signature.SigningError

crypto.signature.verify(message, signature, public_key)
    depends on:
        hashlib.SHA512.new
        hashlib.SHA512.update
        hashlib.SHA512.digest
        crypto.RSA.new
        crypto.RSA.verify
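To make the sign/verify layering concrete, here is a rough sketch of "hash with SHA-512, then apply the RSA private operation". Everything in it is an assumption made for illustration: the toy key is built from two small Mersenne primes, there is no padding, and the helper names are hypothetical. It uses the real `hashlib.sha512()` constructor rather than the `hashlib.SHA512.new` spelling in the outline, and it is in no way secure.

```python
import hashlib

# Toy RSA key (illustration only): two known Mersenne primes, far too small
# for real use, and no padding scheme is applied.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse; needs Python 3.8+
SIG_LEN = (N.bit_length() + 7) // 8


def _digest_as_int(message):
    """SHA-512 the message and reduce the digest into the range [0, N)."""
    digest = hashlib.sha512(message).digest()
    return int.from_bytes(digest, "big") % N


def sign(message):
    """Return a bytes signature: (hash ** D) mod N (hypothetical helper)."""
    return pow(_digest_as_int(message), D, N).to_bytes(SIG_LEN, "big")


def verify(message, signature):
    """Check that (signature ** E) mod N matches the message hash."""
    recovered = pow(int.from_bytes(signature, "big"), E, N)
    return recovered == _digest_as_int(message)


signature = sign(b"attack at dawn")
print(verify(b"attack at dawn", signature))
print(verify(b"attack at dusk", signature))
```

In the proposed design the two `pow()` calls would instead be crypto.RSA.sign and crypto.RSA.verify, backed by OpenSSL.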

Layer 2
-------

Utilities

crypto.keys.strengthen_password(password) -> key
    depends on:
        openssl: RAND_bytes, EVP_get_digestbyname, EVP_BytesToKey
    raises:
        crypto.keys.KeyGenerationError
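For readers unfamiliar with password strengthening, a minimal sketch of the idea follows. It swaps in PBKDF2 via the standard library's `hashlib.pbkdf2_hmac` in place of the OpenSSL key-derivation routine named above, so it is not the proposed implementation. The salt and key lengths, the iteration count, and the `(salt, key)` return shape are all assumptions.

```python
import hashlib
import os

KEY_LENGTH = 32       # bytes; assumed target, enough for an AES-256 key
SALT_LENGTH = 16      # bytes of random salt (assumption)
ITERATIONS = 100_000  # assumed work factor; tune for the target hardware


class KeyGenerationError(Exception):
    """Stand-in for the proposed crypto.keys.KeyGenerationError."""


def strengthen_password(password, salt=None):
    """Derive a fixed-length key from a password; return (salt, key)."""
    if not isinstance(password, bytes):
        raise KeyGenerationError("password must be bytes")
    if salt is None:
        salt = os.urandom(SALT_LENGTH)  # fresh random salt per derivation
    key = hashlib.pbkdf2_hmac(
        "sha256", password, salt, ITERATIONS, dklen=KEY_LENGTH
    )
    return salt, key


salt, key = strengthen_password(b"correct horse battery staple")
# Re-deriving with the stored salt yields the same key.
assert strengthen_password(b"correct horse battery staple", salt) == (salt, key)
```

Returning the salt alongside the key matches the `(salt, iv, ciphertext)` tuple that crypto.cipher.encrypt is described as producing: the salt has to be stored with the ciphertext so the key can be re-derived at decryption time.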

Symmetric Encryption

crypto._cipher_object
    crypto._cipher_object.CipherObject._ctx = openssl context | None
    crypto._cipher_object.CipherObject._cipher = openssl cipher | None
    crypto._cipher_object.CipherObject._key = bytes | None

    CipherObject.encrypt(self, data) -\> ciphertext
        depends on:
            crypto.\_cipher_object.CipherObject.encrypt_init
            crypto.\_cipher_object.CipherObject.encrypt_update
            crypto.\_cipher_object.CipherObject.encrypt_finalize
        raises:
            crypto.\_cipher_object.EncryptError

    CipherObject.encrypt_init() -\> None
        depends on:
            openssl: EVP_EncryptInit_ex
        raises:
            crypto.\_cipher_object.EncryptInitError

    CipherObject.encrypt_update
        depends on:
            openssl: EVP_EncryptUpdate
        raises:
            crypto.\_cipher_object.EncryptUpdateError

    CipherObject.encrypt_finalize
        depends on:
            openssl: EVP_EncryptFinal_ex
        raises:
            crypto.\_cipher_object.FinalizeError

    CipherObject.decrypt(self, ciphertext) -\> message
        depends on:
            crypto.\_cipher_object.CipherObject.decrypt_init
            crypto.\_cipher_object.CipherObject.decrypt_update
            crypto.\_cipher_object.CipherObject.decrypt_finalize
        raises:
            crypto.\_cipher_object.DecryptError

    CipherObject.decrypt_init() -\> None
        depends on:
            openssl: EVP_DecryptInit_ex
        raises:
            crypto.\_cipher_object.DecryptInitError

    CipherObject.decrypt_update
        depends on:
            openssl: EVP_DecryptUpdate
        raises:
            crypto.\_cipher_object.DecryptUpdateError

    CipherObject.decrypt_finalize
        depends on:
            openssl: EVP_DecryptFinal_ex
        raises:
            crypto.\_cipher_object.DecryptFinalizeError

crypto.AES
    crypto.AES.new(key, mode, IV=None) -\> cipher_object

crypto.DES
    crypto.DES.new(key, mode, IV=None) -\> cipher_object
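The CipherObject outline above composes a one-shot encrypt() out of three lower-level phases. A rough sketch of that composition is below. The "cipher" is a repeating-key XOR used purely as a placeholder so the example runs without OpenSSL; it provides no security, and the class body is an assumption rather than the proposed code.

```python
class EncryptError(Exception):
    """Stand-in for the proposed crypto._cipher_object.EncryptError."""


class CipherObject:
    """Toy init/update/finalize layering. NOT a real cipher (XOR placeholder)."""

    def __init__(self, key):
        if not key:
            raise ValueError("key must be non-empty bytes")
        self._key = key
        self._pos = None  # offset into the key stream; None until init

    def encrypt_init(self):
        # A real backend would call EVP_EncryptInit_ex here.
        self._pos = 0

    def encrypt_update(self, data):
        # A real backend would call EVP_EncryptUpdate here.
        if self._pos is None:
            raise EncryptError("encrypt_init() was not called")
        out = bytes(
            b ^ self._key[(self._pos + i) % len(self._key)]
            for i, b in enumerate(data)
        )
        self._pos += len(data)
        return out

    def encrypt_finalize(self):
        # A real block cipher would flush its final padded block here.
        self._pos = None
        return b""

    def encrypt(self, data):
        """One-shot convenience wrapper around the three phases."""
        self.encrypt_init()
        return self.encrypt_update(data) + self.encrypt_finalize()


cipher = CipherObject(b"key")
one_shot = cipher.encrypt(b"attack at dawn")

# Feeding the same data in chunks gives the same result as the one-shot call.
cipher.encrypt_init()
chunked = cipher.encrypt_update(b"attack ") + cipher.encrypt_update(b"at dawn")
chunked += cipher.encrypt_finalize()
assert one_shot == chunked
```

Keeping the three phases public is what lets callers stream large inputs, while encrypt() covers the common case of a message that fits in memory.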

Asymmetric Encryption

crypto.RSA
    crypto.RSA.new(public_key=None, private_key=None, padding=4) -> crypto.RSA.RSA
        depends on:
            openssl: d2i_RSAPublicKey, d2i_RSAPrivateKey
        raises:
            crypto.RSA.KeyError
            crypto.RSA.InitializationError

    crypto.RSA.generate_keypair(bitlength) -\> public_key, private_key
        depends on:
            openssl: RSA_generate_key, i2d_RSAPublicKey, RSA_free
        raises:
            crypto.RSA.KeygenError

crypto.RSA.RSA
    crypto.RSA.RSA.\_public_key = openssl RSA key | None
    crypto.RSA.RSA.\_private_key = openssl RSA key | None
    crypto.RSA.RSA.\_padding_type = integer

    crypto.RSA.RSA.encrypt(self, data) -\> ciphertext
        depends on:
            openssl: RSA_size, RSA_public_encrypt
        raises:
            crypto.RSA.EncryptionError

    crypto.RSA.RSA.decrypt(self, ciphertext) -\> message
        depends on:
            openssl: RSA_size, RSA_private_decrypt
        raises:
            crypto.RSA.DecryptionError

    crypto.RSA.RSA.sign(self, hash) -\> signature
        depends on:
            openssl: RSA_size, RSA_sign
        raises:
            crypto.RSA.SigningError

    crypto.RSA.RSA.verify(self, hash, signature) -\> True | False
        depends on:
            openssl: RSA_size, RSA_verify
        raises:
            crypto.RSA.VerificationError

Geremy Condra

pitrou commented 14 years ago

On Sunday, 20 June 2010 at 06:30 +0000, geremy condra wrote:

crypto API \========== [...]

For presentation purposes, I would order layers by abstraction level: that is, "layer 1" should be the lower-level layer and "layer 2" the upper-level.

I think all further discussion should happen on the PEP itself.

malemburg commented 14 years ago

Apart from the question of API, please also include a section on the legal implications this move would have on Python in the PEP.

We currently only include OpenSSL in the Windows installers and (for some reason) don't pay much attention to the implications this has (the fact is not mentioned on the download page and the Windows installer doesn't show the required OpenSSL old-style BSD attribution).

If we are to require OpenSSL or some other crypto lib, possibly even our own (e.g. pycrypto) for all platforms, then we could no longer just ignore the fact that crypto code is subject to strong legislation in many countries of the world.

pitrou commented 14 years ago

If we are to require OpenSSL or some other crypto lib,

We already depend on OpenSSL for both hashlib and ssl, this proposal wouldn't change anything in this regard.

malemburg commented 14 years ago

Antoine Pitrou wrote:

Antoine Pitrou <pitrou@free.fr> added the comment:

> If we are to require OpenSSL or some other crypto lib,

We already depend on OpenSSL for both hashlib and ssl, this proposal wouldn't change anything in this regard.

hashlib can still work without OpenSSL and hash algorithms don't fall under crypto laws. ssl doesn't work without OpenSSL, but also doesn't require adding any crypto code to the stdlib.
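The split between built-in and OpenSSL-provided hashes can be inspected at runtime. The short sketch below uses the `hashlib.algorithms_guaranteed` and `hashlib.algorithms_available` attributes, which were added in later Python 3 releases than the ones under discussion here; the extras it prints depend on how the interpreter was built.

```python
import hashlib

# Algorithms every build of the interpreter provides on its own.
guaranteed = hashlib.algorithms_guaranteed
# Everything this interpreter can offer, including any OpenSSL-backed extras.
available = hashlib.algorithms_available

# Names that exist only because of the linked crypto library (may be empty).
print(sorted(available - guaranteed))

# A guaranteed algorithm works whether or not OpenSSL is present.
digest = hashlib.sha256(b"no OpenSSL required").hexdigest()
print(digest)
```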

The main point that needs to be addressed is shipping Python with crypto code. If OpenSSL is optionally used, we're fine, but if we start shipping crypto code, things are more contrived.

See http://rechten.uvt.nl/koops/cryptolaw/ for a survey.

We're hosting the Python software on servers in The Netherlands, so have to follow the Wassenaar Arrangement if we include crypto code. Fortunately, that agreement includes a clause which pretty much exempts open source crypto code from export regulations.

However, users of Python downloading installers with crypto software would import and use it in their resp. countries and that may get them into trouble, so they need to be warned if we decide to ship crypto code with Python.

The way I understand Geremy's suggestion is to just include a wrapper for OpenSSL, so that's fine. The PEP should include a mention of the above to argue against putting e.g. pycrypto into the stdlib (not because it's poor software, much to the contrary, only because it causes lots of problems for our users and the developers).

malemburg commented 14 years ago

Marc-Andre Lemburg wrote:

We currently only include OpenSSL in the Windows installers and (for some reason) don't pay much attention to the implications this has (the fact is not mentioned on the download page and the Windows installer doesn't show the required OpenSSL old-style BSD attribution).

I've opened bpo-9119 to address this part.

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago
On Tue, Jun 29, 2010 at 2:25 PM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
>
> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>
> Antoine Pitrou wrote:
>>
>> Antoine Pitrou <pitrou@free.fr> added the comment:
>>
>>> If we are to require OpenSSL or some other crypto lib,
>>
>> We already depend on OpenSSL for both hashlib and ssl, this proposal
>> wouldn't change anything in this regard.
>
> hashlib can still work without OpenSSL and hash algorithms don't
> fall under crypto laws. ssl doesn't work without OpenSSL, but also
> doesn't require adding any crypto code to the stdlib.

This won't change the status quo, as my code simply leverages OpenSSL rather than being an independent implementation.

The main point that needs to be addressed is shipping Python with crypto code. If OpenSSL is optionally used, we're fine, but if we start shipping crypto code, things are more contrived.

As I say, we're doing things exactly how they're already done. Python would not be shipping any more crypto code with this module than it already does.

See http://rechten.uvt.nl/koops/cryptolaw/ for a survey.

I've looked over it before and didn't notice anything glaringly applicable, outside of the Windows situation. IANAL, though.

We're hosting the Python software on servers in The Netherlands, so have to follow the Wassenaar Arrangement if we include crypto code. Fortunately, that agreement includes a clause which pretty much exempts open source crypto code from export regulations.

Again, this seems to me something more relevant to the OpenSSL folks than to us.

However, users of Python downloading installers with crypto software would import and use it in their resp. countries and that may get them into trouble, so they need to be warned if we decide to ship crypto code with Python.

Your suggestion about a warning for Windows downloads seems appropriate. I'm not sure how much more than that needs to be done, though.

The way I understand Geremy's suggestion is to just include a wrapper for OpenSSL, so that's fine. The PEP should include a mention of the above to argue against putting e.g. pycrypto into the stdlib (not because it's poor software, much to the contrary, only because it causes lots of problems for our users and the developers).

I'll add mention of the concern over export laws, but it's probably not feasible to get similar security properties out of any reimplementation that could be crafted in a reasonable time anyway.

As a note, I intend to have prototype code ready at approximately the same time as the PEP, so, time permitting, you should be able to play with this before too long.

Geremy Condra

merwok commented 14 years ago

Geremy, could you kindly give a status update? Thanks

e3cd80c0-0d1c-412a-b4b4-5e542ba3ba75 commented 14 years ago

On Thu, Aug 26, 2010 at 3:49 PM, Éric Araujo <report@bugs.python.org> wrote:

Éric Araujo <merwok@netwok.org> added the comment:

Geremy, could you kindly give a status update? Thanks

The block and stream cipher parts of the library (RC4, AES, and DES) are functionally complete. I'm putting the finishing touches on envelope encryption this week, but would greatly appreciate assistance in demonstrating the library's capabilities- one person is helping with AES encryption in ziplib, but other examples would be very helpful.

Geremy Condra

merwok commented 14 years ago

Thanks for the reply, the situation looks good!

I’m an interested outsider with practically no knowledge of encryption except from a high-level GPG user viewpoint, so I can’t help with tests, but I could give a hand to documentation.

b993299a-f990-4540-b708-2c574aa9e4db commented 14 years ago

May I recommend using libtomcrypt instead of openssl because of the advertising problem outlined here?

http://bugs.python.org/issue9119

In my opinion, libtomcrypt is easier to use and cleaner. It compiles on Windows without requiring Perl, and is free of the advertising clause in OpenSSL since it is public domain.

http://libtom.org/?page=features&newsitems=5&whatfile=crypt

pitrou commented 14 years ago

May I recommend using libtomcrypt instead of openssl because of the advertising problem outlined here?

Changing libraries because of an "advertising problem" doesn't sound reasonable. The latter is much more easily solved than the former.

Besides, libtomcrypt doesn't seem to provide SSL or TLS support (at least the Web page you linked to doesn't say so), so OpenSSL would still be needed for the ssl module.

8726d1eb-a365-45b6-b81d-c75988975e5a commented 14 years ago

How about nss? As a bonus, this would also avoid making more work for Fedora (http://fedoraproject.org/wiki/FedoraCryptoConsolidation).

pitrou commented 14 years ago

How about nss? As a bonus, this would also avoid making more work for Fedora (http://fedoraproject.org/wiki/FedoraCryptoConsolidation).

Well, similar question: what will it bring and who will do the work? :) (Fedora perhaps?)

davidmalcolm commented 14 years ago

On Fri, 2010-09-17 at 23:11 +0000, Antoine Pitrou wrote:

Antoine Pitrou <pitrou@free.fr> added the comment:

> How about nss? As a bonus, this would also avoid making more work for Fedora (http://fedoraproject.org/wiki/FedoraCryptoConsolidation).

Well, similar question: what will it bring and who will do the work? :) (Fedora perhaps?)

Possibly me - if you'll take my patches :)

8726d1eb-a365-45b6-b81d-c75988975e5a commented 14 years ago

What it will bring: APIs which aren't absolutely insane; full SSL support; RSA, DSA, ECDSA, Diffie-Hellman, EC Diffie-Hellman, AES, Triple DES, DES, RC2, RC4, SHA-1, SHA-256, SHA-384, SHA-512, MD2, MD5, HMAC: Common cryptographic algorithms used in public-key and symmetric-key cryptography; simplified FIPS 140 validation; better licensing (MPL).

I'm interested in stuff based on nss, but I definitely won't promise to do the work. Fortunately dmalcolm seems to be on top of that. :)

davidmalcolm commented 14 years ago

I should note that I can't touch anything to do with Elliptic Curve crypto. I don't know if I can comment on the reasons for that.