Closed: tarcieri closed this 4 years ago
I haven't had the chance to use generic-array and typenum yet (except as a consumer of crypto libraries) and my knowledge of them and of const generics in general is pretty limited, but I understand we need them to build stuff that is generic over something like a key size, or in our case the tag size. They don't look too horrible to me, and since it's the best way of doing things right now, I'm completely fine with it.
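As a rough sketch of that pattern (the type names here are invented for illustration, and it assumes a recent generic-array, which re-exports typenum): the size becomes a type parameter, so the tag length is fixed at compile time.

```rust
// Minimal sketch, not the crate's actual API: typenum + generic-array let a
// type be generic over a tag size, with the length checked at compile time.
use core::marker::PhantomData;
use generic_array::typenum::U16;
use generic_array::{ArrayLength, GenericArray};

struct TagBuilder<TagSize: ArrayLength<u8>> {
    _size: PhantomData<TagSize>,
}

impl<TagSize: ArrayLength<u8>> TagBuilder<TagSize> {
    /// Produces a zeroed tag whose length is fixed by the type parameter.
    fn empty_tag(&self) -> GenericArray<u8, TagSize> {
        GenericArray::default()
    }
}

fn main() {
    // Pick a 16-byte tag purely through the type system.
    let builder: TagBuilder<U16> = TagBuilder { _size: PhantomData };
    assert_eq!(builder.empty_tag().len(), 16);
}
```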
I saw the default implementations when I checked out the trait, it's pretty neat! And thanks for the link to the example, that was very instructive to get an idea of how it's going to look when it's finished.
It's too bad we're going to lose detail in the error, but I understand the reasoning behind it, and if that's the price for making the crate usable generically, it's well worth it. The std::error::Error implementation is something I only thought of last-minute, because I saw some other crates doing it and it makes sense. If you can get it into the aead crate, that would be great.
I really like where this is going, great job so far!
Nice! I'll finish it out, then
Also, opened a PR to add an optional std::error::Error impl for aead::Error:
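Such an impl is typically small and gated behind an std feature; a sketch of the general shape (the feature name and Display text here are assumptions, not necessarily what the actual PR does):

```rust
// Rough sketch of an optional std::error::Error impl for an opaque error type.
use core::fmt;

/// An opaque error that deliberately carries no details.
#[derive(Clone, Copy, Debug, Default)]
pub struct Error;

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("aead error")
    }
}

// Only available when the (hypothetical) `std` feature is enabled, so the
// crate itself stays no_std-friendly.
#[cfg(feature = "std")]
impl std::error::Error for Error {}
```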
Removed WIP. The tests are now passing locally for me (and the doc tests pass with --all-features).

Sidebar: it might be good to call the CcmMode type AesCcm since it's specialized to AES, or make it generic around the cipher type and make AesCcm a type alias.
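A hypothetical sketch of that second option (generic mode plus an AES alias; the bounds, field layout and the AES-128 choice are illustrative assumptions, not the crate's real definition):

```rust
// Illustration only: make the mode generic over the block cipher and keep
// AesCcm as the AES-specialized type alias users would normally reach for.
use core::marker::PhantomData;

pub struct CcmMode<C, TagSize> {
    cipher: C,
    _tag_size: PhantomData<TagSize>,
}

/// The AES-specialized name (assumes AES-128 here, purely for illustration).
pub type AesCcm<TagSize> = CcmMode<aes::Aes128, TagSize>;
```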
Excellent!
Sidebar: it might be good to call the CcmMode type AesCcm since it's specialized to AES, or make it generic around the cipher type and make AesCcm a type alias.
I renamed CcmMode to AesCcm, good idea.
Also, I publicly exported the CcmTagSize marker trait, because that makes it show up in the documentation and it's easy to see for which tag sizes it's implemented (= which tag sizes are valid). Do you think that makes sense, or is there a reason why you didn't export it?
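For illustration, such a marker trait might look roughly like this (CCM allows tag lengths of 4, 6, 8, 10, 12, 14 and 16 bytes; the exact impls in the crate may differ):

```rust
// Illustrative marker trait: only the typenum types for valid CCM tag lengths
// implement it, so requesting an invalid size is a compile error.
use generic_array::typenum::{U10, U12, U14, U16, U4, U6, U8};

pub trait CcmTagSize {}

impl CcmTagSize for U4 {}
impl CcmTagSize for U6 {}
impl CcmTagSize for U8 {}
impl CcmTagSize for U10 {}
impl CcmTagSize for U12 {}
impl CcmTagSize for U14 {}
impl CcmTagSize for U16 {}
```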
Otherwise, the code looks (very) good to me and the tests check out. To further test it, I updated a small embedded project of mine to use this version and do some analysis. It's one where I can use allocation, so it only uses the encrypt and decrypt methods. For binary size, there's barely any difference. Interestingly, it's even slightly smaller than before, which is great. But the real surprise was stack usage. I did a static analysis of the LLVM IR with the call-stack tool to get a rough idea of the memory requirements. Of course it can only look at the stack, but since I use the heap very sparingly, that's the vast majority of the memory.

So I looked at a function which indirectly uses AES-CCM through a library that uses it. There's some inlining going on which I can't prevent without plastering the library with #[inline(never)], so to call-stack it looks as if the CcmMode::new function is called directly in that function, for example. And for some reason, the new version uses 1760 bytes less. Can you imagine where these savings come from? And that's not all. I don't really see where exactly it happens because of the inlining, but there are another 4872 bytes less on the stack somewhere in code that is called by this function. The only thing that is different is the AES-CCM implementation. Because I use the encrypt and decrypt methods, my ciphertext and plaintext are allocated on the heap, but that can't be the reason: the buffers I used in the previous version were heap-allocated too, and the length of my plaintexts is too short to make a difference anyway. It's curious stuff, which I could probably get to the bottom of by building a separate example that only does AES-CCM and nothing else, so my results are not polluted. I might do that if I find the time. Just thought it could be worth posting, in case you have an inkling of how a small difference in the new code could explain that.
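As an aside, the #[inline(never)] trick mentioned above looks like this in isolation (a standalone sketch, not project code):

```rust
// #[inline(never)] keeps a function as its own frame in the LLVM IR, so a
// static analyzer like the call-stack tool can attribute its stack usage
// separately instead of folding it into the caller.
#[inline(never)]
fn scratch_heavy() -> u8 {
    let buf = [0u8; 256]; // large local buffer counted against this frame
    std::hint::black_box(&buf); // keep the buffer from being optimized away
    buf[0]
}

fn main() {
    // Without the attribute, `scratch_heavy` may be inlined here and its
    // 256 bytes would show up as `main`'s local usage instead.
    let _ = scratch_heavy();
}
```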
Also, I publicly exported the CcmTagSize marker trait, because that makes it show up in the documentation and it's easy to see for which tag sizes it's implemented (= which tag sizes are valid). Do you think that makes sense, or is there a reason why you didn't export it?
Nope, just an oversight.
And for some reason, the new version uses 1760 bytes less. Can you imagine where these savings come from?
If you only have one invocation and it's inlining, it could be saving you the overhead of setting up the call site and associated C stack, but that's just a guess.
If you only have one invocation and it's inlining, it could be saving you the overhead of setting up the call site and associated C stack, but that's just a guess.
Ok, I guess I'll try and look into it some more. In any case, I've only observed positive effects, so we're all good.
From your side, do you think everything's ready to merge in the current state and for me to release it as 0.4.0?
Yep, looks good on my end
I built a standalone test where I managed to inhibit most inlining. For every function, the first number is its full stack usage (local + that of all functions it calls) and the second one is that of just its local variables. This is a function encrypting and decrypting something with the old version:
16744 6536 aes_test::encrypt1
0: 10208 8664 aes_ccm::ccm::CcmMode::new
1: 512 104 aes_ccm::ccm::CcmMode::decrypt_verify
2: 496 88 aes_ccm::ccm::CcmMode::generate_encrypt
3: 288 48 alloc::vec::Vec<T>::with_capacity
4: 264 8 core::result::Result<T,E>::unwrap::h3076ded6e4a3354d
5: 256 0 core::result::Result<T,E>::unwrap::h37fafb6e29ae071b
6: 248 8 core::ptr::real_drop_in_place::h862ef114a5dead20
7: 200 32 core::panicking::panic_fmt
8: 24 0 core::cmp::impls::<impl core::cmp::PartialEq<&B> for &A>::eq
And here's the same thing with the new version:
11776 3328 aes_test::encrypt1
0: 8448 6904 <aes_ccm::ccm::AesCcm<TagSize> as aead::NewAead>::new
1: 544 120 aead::Aead::decrypt
2: 544 120 aead::Aead::encrypt
3: 272 0 core::ptr::real_drop_in_place::h2dcc130d85d46944
4: 256 0 core::result::Result<T,E>::unwrap::h81a1198cbaa19f98
5: 200 32 core::panicking::panic_fmt
6: 24 8 core::cmp::impls::<impl core::cmp::PartialEq<&B> for &A>::eq
Overall it uses 4968 bytes less stack, of which the local usage makes up 3208, and the remaining difference of 1760 bytes comes from the CcmMode::new/AesCcm::new function, which is now less demanding.
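As a quick sanity check on those figures (all numbers copied from the two listings above):

```rust
fn main() {
    let total_saved = 16744 - 11776; // overall stack saved by the new version
    let local_saved = 6536 - 3328;   // saved in encrypt1's own locals
    let new_fn_saved = 10208 - 8448; // CcmMode::new vs AesCcm::new
    assert_eq!(total_saved, 4968);
    assert_eq!(local_saved, 3208);
    assert_eq!(new_fn_saved, 1760);
    // The two partial savings account for the whole difference.
    assert_eq!(total_saved, local_saved + new_fn_saved);
}
```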
Thank you so much for your work! It's been great to have you come and implement this, making the crate that much better.
Awesome, thanks!
Doesn't compile yet, but curious about the general direction. The current aead traits heavily leverage generic-array and typenum. Are you ok with that? They're a bit awful for now, but const generics seem like they're shaping up and may hopefully be available in 2020.
To fit mlen into the Aead traits, I made it a generic parameter. It still enforces the valid set of tag sizes, but using the type system via a TagSize marker trait which is only impl'd for the valid sizes.
Note that there are default impls of the rest of the methods of the Aead trait (e.g. encrypt, encrypt_in_place) which you get "for free" with any cipher with a postfix tag, which AES-CCM appears to be.
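Loosely, the reason a postfix-tag construction gets those methods for free is something like the following (heavily simplified sketch; the real trait methods work on GenericArray tags and a Buffer abstraction rather than a plain Vec):

```rust
// Illustration only: given a detached-mode primitive that encrypts in place
// and returns the tag, the combined "append the tag" method falls out
// automatically for any postfix-tag cipher.
fn encrypt_in_place(
    encrypt_detached: impl Fn(&mut [u8]) -> [u8; 16], // in-place encrypt, returns tag
    buffer: &mut Vec<u8>,
) {
    let tag = encrypt_detached(buffer.as_mut_slice());
    buffer.extend_from_slice(&tag); // postfix tag: just append it to the ciphertext
}
```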
Using in-place encryption in heapless environments requires the heapless crate, but is hopefully a bit more ergonomic than the current API. See here for an example: https://docs.rs/chacha20poly1305/0.3.0/chacha20poly1305/#in-place-usage-eliminates-alloc-requirement
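Paraphrasing that linked example, the in-place flow with a fixed-capacity buffer looks roughly like this (heapless 0.5-era types assumed; the encrypt call is left commented out since it needs a real cipher, key and nonce):

```rust
use heapless::consts::U128;
use heapless::Vec; // fixed-capacity vector, no heap allocation

fn main() {
    // Buffer with room for the plaintext plus the appended tag.
    let mut buffer: Vec<u8, U128> = Vec::new();
    buffer.extend_from_slice(b"plaintext message").unwrap();

    // cipher.encrypt_in_place(nonce, b"", &mut buffer).unwrap();
    // encrypt_in_place overwrites the plaintext with ciphertext and appends
    // the tag, all within the fixed-capacity buffer.
}
```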
Finally, it's using the upstream aead::Error type, which is fully opaque. I hope that's ok: we prefer it for preventing potential side-channel leakage. Note it doesn't (yet) have an optional impl std::error::Error, which is an oversight I'll open an upstream PR for.