openwsn-berkeley / lakers

EDHOC implemented in Rust, optimized for microcontrollers, with bindings for C and Python.
https://crates.io/crates/lakers
BSD 3-Clause "New" or "Revised" License
12 stars 10 forks source link

Role struct constructors #99

Closed chrysn closed 4 months ago

chrysn commented 9 months ago

Attempting to use edhoc-rs in a demo application, I found a few properties of the role constructors (eg. EdhocInitiator) odd:

I think I could provide patches for most of those, but these might be here for a reason. Could you explain why those are as they are, and whether those are boundary conditions set up by hacspec, or maybe just remnants of earlier development phases? If hacspec tightens things down here, do you have a roadmap for how to provide additional interfaces for constrained devices when hacspec limitations do not apply?

malishav commented 9 months ago

Thanks for the feedback. See below.

The g_i / g_r is fed in explicitly -- from an application point of view I'd expect the library to either use the platform's random number generator (or, when that is not possible, to provide a random number callback that'll pull as many bytes as needed).

Note that these are not the ephemeral keys. These are the public keys present in the corresponding CRED_I/CRED_R. For the moment, we do not parse the credentials for the sake of simplicity but rather use these values in the corresponding ECDH calculations.

The constructors expect to be fed both sides' credentials -- the responder needs to know in advance which id_cred_i and cred_i the initiator will send, whereas I'd expect EDHOC operations to work with arbitrary initiators. It'd still be up to the application to provide the right CRED_R for an ID_CRED_R, but that could be a map in the simplest case (esp. when KIDs are used), or network based in more complex ones.

This is indeed a current limitation of the API. We would be happy to accept a PR on this.

All those binary arguments are passed in hex encoded strings -- a conversion I'd prefer never to do on an embedded system.

True. I went through lib/src/lib.rs quickly and it seems byte strings would work equally well. I do no recall why the particular choice to use hex-encoded string was made. Again, happy to accept a PR on this.

Note that with the evolution of the hacspec framework, we can now use it on the vanilla Rust branch of the code. That means that we will be getting rid of the hacspec version of the code quite soon, so if you are going to work on the code base, I would suggest doing so on lib/src/lib.rs and lib/src/edhoc.rs.

chrysn commented 9 months ago

That's good to hear!

Note that these are not the ephemeral keys.

Right, I mixed those up. (So then, providing the g_peer will be up to whatever provides the cred_peer for the provided id_cred_peer).

would suggest doing so on lib/src/lib.rs and lib/src/edhoc.rs.

Those two have large mod hacspec / mod rust blocks; will these be unified, and is there a branch I can look at to see what of those survives? (In particular, will the array!() thingies still be around?)

For testing of upcoming PRs, is there anything I should do locally, or can I trust that CI will complain if I leave the hacspec parts behind somehow?

malishav commented 9 months ago

Those two have large mod hacspec / mod rust blocks; will these be unified, and is there a branch I can look at to see what of those survives? (In particular, will the array!() thingies still be around?)

Note that only lib/src/lib.rs should have rust and hacspec blocks. Code in lib/src/edhoc.rs is vanilla Rust code. @geonnave is working on a branch where hacspec parts are removed and will let you know. The idea is to completely remove the "hacspec" blocks so array!() macros will go away!

For testing of upcoming PRs, is there anything I should do locally, or can I trust that CI will complain if I leave the hacspec parts behind somehow?

The CI does all the necessary steps, but the hacspec checks will run only on the hacspec part of the code. So, for now, work on the Rust code and we can test later the hacspec aspects. (Please do not use early returns as these won't work when hacspec-checked).

chrysn commented 9 months ago

Looking at this with some evenings into the library, and looking at all the data passed in at the constructor, it seems to me that the end game of a EdhocResponder::new() might look quite different from now, in that it would create its default state because

This can all be expressed as extra arguments and return values in the right functions (and implementing it is one of my next steps). However, I'm wondering whether given the strict sequence (out-of-sequence calls make error go WrongState anyway), and given that the .state is taken and replaced anyway (also owing to hacspec's functional nature), may it not make sense to embrace typestate in the next iteration after that? (After learning some more of how this library works, it seems to me that the lib/src/lib.rs part will not be verified, so we'd be rather free in creating an idiomatic API there.)

So a mid-term proposal could look like

let a: EdhocResponder = Default::default();
let (a, ead1) = a.process_message_1(); // now a is of type EdhocResponderPhase1
let (a, m2) = a.prepare_message_2(...); // now a is of type EdhocResponderPhase2

and any error in sequence would be spotted by the type checker, instead of returning WrongState at runtime. There'd probably still be checks for the state (because the inner State object would stay the same, and we can't easily express its invariant to Rust), but some good const propagation could still make them go away completely.

chrysn commented 9 months ago

Trying around, this is going to affect edhoc.rs more, because we'll have to split up r_process_message_3 (and others if we want to decide the identity of R based on message 1's EAD). That is tricky because

  1. it'll need to process the same message twice (OK, maybe not b/c decode_plaintext_3 returns owned data, but that may not be that way forever),
  2. it's about processing processing encrypted data before they are authenticated (which is generally something to be very careful about towards the application), and
  3. we're touching part of what makes verification happen.

However, the alternative is to decide CRED_I / G_I in a callback from within r_process_message_3, which I can't imagine makes anything any better, especially because that callback may block / may require async. (For example if looking up the credential takes time b/c it happens over a network and we have a cache miss).

Let's ignore 1 for the time being -- it may be a lot of data but it's bounded as long as we have limits on the EAD size we can process. I'd like to sketch a proposal first -- this would replace what was previously a let (state, prk_out) = r_process_message_3(state, message_3, id_cred_i_expected, cred_i_expected, g_i)? call:

let (state, mac_3, id_cred_i, ead_3_untrusted) = r_process_message_3a(state, message_3)?;
let (cred_i, g_i) = look_up_credentials(&id_cred_i); // may also use ead_3_untrusted
let (state, prk_out) = r_process_message_3b(state, mac_3, id_cred_i, ead_3_untrusted, cred_i, g_i)?;
// As that was successful, we may now also rename ead_3_untrusted to ead_3

Assembling the check from a mac_3 and the ead_3_untrusted is the alternative to passing in the message_3 once more -- none of which I'm really happy with, given that all things might go wrong if the user sends in the wrong mac_3 / ead_3_untrusted. The alternative is lugging them around with the state (which gets moved around anyway), but I'm not sure how well this would fly with hax:

let state_and_more = r_process_message_3a(state, message_3)?; // different type from `state`
let (cred_i, g_i) = look_up_credentials(state_and_more.id_cred_i()); // may also use accessor for ead_3_untrusted
let (state, prk_out) = r_process_message_3b(state_and_more, cred_i, g_i)?;
// ead_3 is now lost, but had we cloned it out, we could now trust it
// (If we were willing to carry it around, the latest state could also keep it; the compiler should
// throw it away anyway if unused if LTO is used)

I have a soft preference for approach 2 (which also moves the State even of the edhoc module into a typestate direction), but both should work, especially given that the lib module will be wrapping it anyway. So the larger question for me is: Which of those works for the verification?

(One way of addressing the question is: Where does the verifier even know what r_process_message_3 is supposed to do? I only found two mentions in the code, in lib's lib.rs and edhoc.rs.)

malishav commented 9 months ago

Looking at this with some evenings into the library, and looking at all the data passed in at the constructor, it seems to me that the end game of a EdhocResponder::new() might look quite different from now, in that it would create its default state because

  • r, cred_r and id_cred_r may be chosen as a result of processing message 1 (there's no draft on that yet, but if we ever did something like encrypted client hello, that's where it'd land)
  • id_cred_i will be learned in message 3
  • cred_i and g_i will be provided by the application after message 3 processing This can all be expressed as extra arguments and return values in the right functions (and implementing it is one of my next steps).

Agreed.

However, I'm wondering whether given the strict sequence (out-of-sequence calls make error go WrongState anyway), and given that the .state is taken and replaced anyway (also owing to hacspec's functional nature), may it not make sense to embrace typestate in the next iteration after that? (After learning some more of how this library works, it seems to me that the lib/src/lib.rs part will not be verified, so we'd be rather free in creating an idiomatic API there.)

Indeed, we organized the library in this manner for exactly the reason you outline. We can and should provide an idiomatic API in lib/src/lib.rs that won't get hascpec-verified, but everything inside of lib/src/edhoc.rs goes under the hacspec scrutiny. Using typestate indeed seems to make sense, as long as it can all be contained in lib/src/lib.rs.

So a mid-term proposal could look like

let a: EdhocResponder = Default::default();
let (a, ead1) = a.process_message_1(); // now a is of type EdhocResponderPhase1
let (a, m2) = a.prepare_message_2(...); // now a is of type EdhocResponderPhase2

This looks great!

There'd probably still be checks for the state (because the inner State object would stay the same, and we can't easily express its invariant to Rust), but some good const propagation could still make them go away completely.

Just to make sure we are on the same page here, when you refer to the "inner" State object, do you mean the value of state passed to the hacspec-verified functions? Don't we still need those checks to verify the correct progress of hte state machine? e.g. how will typeset make sure that we do not process the same message twice if such a message was received over the wire?

chrysn commented 9 months ago

Don't we still need those checks to verify the correct progress of hte state machine?

They will, but may use results from the outside in const propagation. Ideally, we could inform the compiler that an invariant needs to be upheld -- say, PublicStateAfterMessage2 { state_for_edhocrs: State } might declare that this type has an invariant that self.state_for_edhocrs == EdhocState::WaitMessage3. We don't have that in Rust now, but tools emulating this may place the assert already at the accessor level (so by the time r_process_message_3 is called, the compiler knows that branch can't occur b/c of the earlier assert) or may even use assume versions (with due care: check at store time, if state !=WaitMessage3 { unsafe { unsafe_unreachable() } }) that moves all the checks to one point in time.

malishav commented 9 months ago

Trying around, this is going to affect edhoc.rs more, because we'll have to split up r_process_message_3 (and others if we want to decide the identity of R based on message 1's EAD). That is tricky because

  1. it'll need to process the same message twice (OK, maybe not b/c decode_plaintext_3 returns owned data, but that may not be that way forever),
  2. it's about processing processing encrypted data before they are authenticated (which is generally something to be very careful about towards the application), and
  3. we're touching part of what makes verification happen.

However, the alternative is to decide CRED_I / G_I in a callback from within r_process_message_3, which I can't imagine makes anything any better, especially because that callback may block / may require async. (For example if looking up the credential takes time b/c it happens over a network and we have a cache miss).

Let's ignore 1 for the time being -- it may be a lot of data but it's bounded as long as we have limits on the EAD size we can process. I'd like to sketch a proposal first -- this would replace what was previously a let (state, prk_out) = r_process_message_3(state, message_3, id_cred_i_expected, cred_i_expected, g_i)? call:

let (state, mac_3, id_cred_i, ead_3_untrusted) = r_process_message_3a(state, message_3)?;
let (cred_i, g_i) = look_up_credentials(&id_cred_i); // may also use ead_3_untrusted
let (state, prk_out) = r_process_message_3b(state, mac_3, id_cred_i, ead_3_untrusted, cred_i, g_i)?;
// As that was successful, we may now also rename ead_3_untrusted to ead_3

Assembling the check from a mac_3 and the ead_3_untrusted is the alternative to passing in the message_3 once more -- none of which I'm really happy with, given that all things might go wrong if the user sends in the wrong mac_3 / ead_3_untrusted. The alternative is lugging them around with the state (which gets moved around anyway), but I'm not sure how well this would fly with hax:

let state_and_more = r_process_message_3a(state, message_3)?; // different type from `state`
let (cred_i, g_i) = look_up_credentials(state_and_more.id_cred_i()); // may also use accessor for ead_3_untrusted
let (state, prk_out) = r_process_message_3b(state_and_more, cred_i, g_i)?;
// ead_3 is now lost, but had we cloned it out, we could now trust it
// (If we were willing to carry it around, the latest state could also keep it; the compiler should
// throw it away anyway if unused if LTO is used)

I have a soft preference for approach 2 (which also moves the State even of the edhoc module into a typestate direction), but both should work, especially given that the lib module will be wrapping it anyway. So the larger question for me is: Which of those works for the verification?

(One way of addressing the question is: Where does the verifier even know what r_process_message_3 is supposed to do? I only found two mentions in the code, in lib's lib.rs and edhoc.rs.)

Before we discard the callback option, let me understand the pitfalls a bit better. I understand that retrieving CRED_I over the network may take time, but can we not just block until that is resolved? What would be an issue with that?

Other than that, on splitting the message 3 processing: I think that approach 1 is dangerous as it lets the application handle internal state of the protocol. I think it is out of the question.

As per approach 2, IIUC, you would essentially be adding mac_3 and ead_3 to state, and adding an additional state in the state machine to expect this (implemented with typestate). That seems fine as an approach to me, although I would ideally prefer to keep message_3 processing in one piece.

It would be great if @karthikbhargavan could chime in here.

chrysn commented 9 months ago

just block until that is resolved

A lot of embedded Rust environments don't block, but run on async instead. While it's certainly possible to have versions of the r_process_message_3 function around for blocking and async evaluation, I doubt that'd fly well with the verification, and I also don't think it's pretty (which is really an odd term to say that I think it's not good practice but can't express the arguments well yet).

adding mac_3 and ead_3 to state

To understand better the impact of this I'd like to add to the open questions: Why is the state struct that's passed around between the phases of the protocol of the same type all the time? AFAICT, every function takes it only in one state and on exit it is in one state (unless there's an error), so why not typestate in the first place? (That'd make carrying mac_3 and ead_3 less hard).

malishav commented 9 months ago

To understand better the impact of this I'd like to add to the open questions: Why is the state struct that's passed around between the phases of the protocol of the same type all the time? AFAICT, every function takes it only in one state and on exit it is in one state (unless there's an error), so why not typestate in the first place? (That'd make carrying mac_3 and ead_3 less hard).

There is no good explanation for this. We discussed previously splitting it but have not implemented it yet.

geonnave commented 6 months ago

Given the merge of #128 and #174, can we close this @chrysn?

chrysn commented 4 months ago

The original points have been addressed. Thanks, closing this.