FD- / RPiPlay

An open-source AirPlay mirroring server for the Raspberry Pi. Supports iOS 9 and up.
GNU General Public License v3.0
4.95k stars 354 forks source link

Use OpenSSL for elliptic curve crypto #187

Closed jasLogic closed 3 years ago

jasLogic commented 3 years ago

I saw that the README mentioned that you may want to implement OpenSSL for the elliptic curve cryptography.

I pretty much just replaced the functions from curve25519-donna and ed25519 with the appropiate OpenSSL functions and defined two new macros for the key and signature size.

I also altered the handle_error function from lib/crypto.c a bit, see the commit for details.

I hope this is what you imagined.

FD- commented 3 years ago

Yeah, that's very close to what I was imagining. Thank you very much!

Still, I'd prefer to retain some abstraction over the OpenSSL API so that we don't have all those (IMHO somewhat) cryptic EVP_* calls clutter the RPiPlay code. I'm thinking something similar to what we have for AES in crypto.c. That way, we could also keep the handle_error function private to the crypto implementation. What do you think?

BTW, did you have a chance to benchmark the performance on a Pi?

jasLogic commented 3 years ago

Sure, sounds reasonable, I will implement this.

How would you want to handle the keys? They still need to be saved in pairing_s and pairing_session_s. Should I wrap them in some struct or should I use the "raw" EVP_PKEY?

I actually didn't use the software on a Pi but on my Desktop but I can try to get my Pi up and running again. How do I benchmark the program, is there a script or something?

pallas commented 3 years ago

@jasLogic this is great, I had started to do this but got frustrated with the EVP_* interface and put a pin in it so I'm glad I don't have to revisit it myself. :) I agree that it would be nice to have a light wrapper around the OpenSSL interface, just in case we ever want to swap in something different in the future. The existing interface should be fine.

@FD- since the ECC stuff happens infrequently and is generally expensive when it does happened (i.e. compared to the AES stream) the performance shouldn't matter.

pallas commented 3 years ago

@jasLogic actually, I take it back.. it looks like this change is almost 100% confined to pairing, so if I were doing it I'd personally avoid writing a second wrapper.. pairing is essentially that wrapper. I might do something like typedef char public_key_t[X25519_KEY_SIZE]; for the only caller, which is RAOP, but my opinion isn't very strong here.

FD- commented 3 years ago

Thanks for your comments @pallas! I'm not sure how to interpret what you've written about performance, but what I meant to say was that I'd expect this to run slightly faster on the Pi than the implementations we had so far, at least that's the experience I made when migrating the AES and SHA parts over.

@jasLogic: I just wanted to get the numbers in case you changed this in an effort to improve performance, but please don't do any extra work if you haven't made any measurements :) I think I just added some primitive time comparison and printfs when I did the OpenSSL AES stuff, but I removed it immediately after.

I'd still prefer having a wrapper around the OpenSSL calls for consistency (I agree with @pallas that the existing curve25519-donna and ed25519 interfaces could work fine). For the keys, I'd suggest a wrapper struct.

Please note that I never touched this part of the code base and only had a very superficial look now. If you feel like what I suggest doesn't make sense, please let me know! I guess you're now more familiar with the pairing code than I ever was :)

pallas commented 3 years ago

What I mean is that x25519 is slow no matter the implementation but it rarely executes (only to establish the AES key), so it can't really affect overall performance. This is different than AES, which is executed often, so the speed of the implementation matters.

FD- commented 3 years ago

True. So I guess measuring the performance doesn't make much sense anyway.

jasLogic commented 3 years ago

Now that I thought about it I think I agree with @FD- about the wrapper. I also think the wrapper functions turned out pretty well.

Regarding performance: I'm not sure how bad this is but what concerns me a little are the OpenSSL contexts which are created and destroyed in most of the wrapper functions. I don't know how bad this is but if OpenSSL allocates something on the heap in the background then this could make the code a little slower. Then again maybe OpenSSL uses some optimization of the algorithms and hardware acceleration which might give a performace benefit.

FD- commented 3 years ago

@jasLogic Fantastic! Thanks for your efforts!

Does the wrapper create and destroy more OpenSSL contexts than the unwrapped versions did?

pallas commented 3 years ago

It shouldn't matter, right? I did a quick perusal and I think the only thing we call that creates a context is ed25519_verify, right? That only happens when a pairing session is finalized. We could make the ctx there static if we're sure it'll never be called from multiple threads but I'd just as soon alloc/free it every time, since x25519 is an expensive operation anyway.

FD- commented 3 years ago

Yeah, I think we're fine with how it's implemented now.

pallas commented 3 years ago

Do you want me to smoketest it?

FD- commented 3 years ago

Yeah, that'd be great!

pallas commented 3 years ago

LGTM

jasLogic commented 3 years ago

Thanks for the merge :). Just to answer your questions:

Does the wrapper create and destroy more OpenSSL contexts than the unwrapped versions did?

That depends. If the whole handshake routine is only called once for every session (which I think is the case) then only one more context is created. If the handshake is done more often it would be more.

It shouldn't matter, right? I did a quick perusal and I think the only thing we call that creates a context is ed25519_verify, right?

Most of the functions create a context. All functions that generate keys, the derive secret function and the sign and verify functions.

But as you said, this is only done when pairing and is cheap relative to the computation so it doesn't really matter.