boltlabs-inc / tss-ecdsa

An implementation of a threshold ECDSA signature scheme

Other

12 stars 5 forks source link

Design approach to allow caller to maintain main storage #167

Closed marsella closed 1 year ago

marsella commented 1 year ago

We need to allow the caller to control storage of public and secret material between instantiations of subprotocols. This includes:

Public key shares from all parties. Generated in keygen
The participant's own private key share. Generated in keygen
Paillier encryption keys and ring-Pedersen parameters from all parties. Generated in aux-info
The participant's own Paillier decryption key. Generated in aux-info
Presign records. Generated in presign and consumed+deleted (one at a time) during signing

Currently, these are maintained in the main_storage field of a Participant. They are stored and retrieved correctly throughout the protocol via the public API in the storage module.

Requirements and comments

Right now, the storage implementation doesn't do any hardening for the storage of secrets. Should that be handled internally or passed up to the caller? If the latter, we should be explicit about danger of insecurely storing the two types of private data and make recommendations (?) to the caller.

Completion criteria

[x] Design an approach to allow the caller to specify how they want to instantiate this storage
[x] Define follow-up issues for implementation

amaloz commented 1 year ago

Here are some initial thoughts from looking over the Participant code.

It looks like every Participant retains its own id and other_participant_ids, and takes a main_storage as an argument to its various methods. Currently, the top-level Participant retains main_storage in its struct and passes it as needed. Here's my initial suggestion(s):

We have a ParticipantInfo (or some better name) struct which contains id, other_participant_ids, and main_storage. The caller is responsible for creating this struct. We can then pass this wrapped in Arc<Mutex<>> to the top-level Participant, which can pass this on to all the sub-protocols. Now, each protocol has access to the main_storage produced by the caller, as well as any local_storage it needs.
We change the Storage struct to a Storage trait with a separate "public" and "secret" storage API. It is up to the caller to implement that correctly (is that too risky?), but then internally we can use this new Storage API to store things properly.

Additional thoughts

Here are some additional thoughts worth considering in the above design:

Is it necessary to Arc<Mutex<>> the entire ParticipantInfo struct, or just main_storage? Probably the latter is sufficient?
As mentioned above, is it too risky to have the caller implement the Storage trait? We could have a default implementation that the caller could use while still allowing the caller to implement their own if necessary.
The Storage trait still needs to be designed, but I'd imagine it'd look similar to what we have now, except separating out calls to "public" and "secret" storage.

amaloz commented 1 year ago

FYI I started a branch with these changes here.

tyurek commented 1 year ago

I'm thinking it would be good to throw other_participant_ids into Storage, where it can be tagged with a session id, so that Participants can handle different peer groups in different sessions concurrently.

I like the idea of a Storage trait. And I agree with providing a default implementation that just stores things in memory like we do now, especially so we have something to use in tests (would it make sense to provide an Encrypted memory version too?). But I like this approach for making it modular

amaloz commented 1 year ago

One issue with moving the current Storage API to a trait is it exposes a bunch of internals (such as StorableType). I think the way around this is to have store just take a key-value pair, e.g.:

fn store<K: Storable>(&mut self, key: K, data: &[u8]) -> Result<()>;

And likewise for retrieve. This'll require a big refactoring of the code but should simplify the outward facing API greatly.

amaloz commented 1 year ago

Here's what I've got currently for the Storage trait. It seems to capture everything we need while hiding internals:

pub trait StorageT: Debug {
    /// Store `data` at the entry provided by `key`.
    fn store<K: Storable>(&mut self, key: K, data: &[u8]) -> Result<()>;
    /// Retrieve data given `key`.
    fn retrieve<K: Storable>(&self, key: K) -> Result<Vec<u8>>;
    /// Transfer data at entry `key` to `storage`.
    fn transfer<K: Storable, S: StorageT>(&self, key: K, storage: &mut S) -> Result<()>;
    /// Delete data at entry `key`, returning the serialized result.
    fn delete<K: Storable>(&mut self, key: K) -> Result<Vec<u8>>;
    /// Returns `true` if all `keys` are used.
    fn contains_batch<K: Storable>(&self, keys: &[K]) -> Result<bool>;

    /// Retrieve deserialized data given `key`.
    fn retrieve_deserialized<K: Storable, D: DeserializeOwned>(&self, key: K) -> Result<D> {
        let data = self.retrieve(key)?;
        deserialize!(&data)
    }
}

The one thing it doesn't capture is public versus secret storage. I think that can be handled either with a bool argument denoting where to store the data, or with separate store_secure etc. functions. I'd vote for the former for API simplicity.

amaloz commented 1 year ago

Here's the latest and greatest API. It contains a SecurityType annotation to let the implementation know whether to store the entry in secure storage or not.

/// A data storage API.
///
/// This trait defines a key-value store for storing public and private data during
/// the execution of the protocol. Whether a value should be stored securely or not
/// is specified by the [`SecurityType`] argument. **It is up to the implementer to implement secure
/// storage!**
pub trait StorageT: Debug {
    /// Store serialized `data` at the entry associated with `key`.
    fn store<K: Storable>(&mut self, key: K, data: &[u8], security: SecurityType) -> Result<()>;
    /// Retrieve data associated with `key`, returning the serialized result.
    fn retrieve<K: Storable>(&self, key: K, security: SecurityType) -> Result<Vec<u8>>;
    /// Transfer data associated `key` to `other`.
    fn transfer<K: Storable, S: StorageT>(
        &self,
        key: K,
        other: &mut S,
        security: SecurityType,
    ) -> Result<()>;
    /// Delete data associated with `key`, returning the serialized result.
    fn delete<K: Storable>(&mut self, key: K, security: SecurityType) -> Result<Vec<u8>>;
}

marsella commented 1 year ago

Thanks for making progress on this! I think my overall reaction is that this is a reasonable abstraction based on the storage we currently have, but think there are some improvements we could make to the current storage model that would lead us to a nicer trait. I'm wondering if we might try to break this down into a few internal refactors first before abstracting the trait.

For example, I don't like the current method of storing plain bytes even though we know the type of everything stored, and I don't like that we save the StorableType but don't actually tie that to the stored/retrieved type (which the caller has to pick/serde manually). I'd rather pass Rust types directly in the storage API. I'm not sure the best way to do this (maybe have the value type be enum StoredData which is like the current StorableType enum, but actually containing the data??)

Question about the abstraction: Do you think the key type will it always be the tuple of (stored type, session id, party id)? I'm trying to think about whether it's plausible that we'd change that index type or use different ones for different scenarios internally, or whether it'd be reasonable to make StorageIndex public (but the internals private) and use it directly. Is the fully generic key type going to trip us up at some point?

I like the security type flag. It's an improvement over what we currently have. I wonder if we could try to tie this to the stored data type, since we already know what the secure types are. This would just be an internal misuse-resistance thing to prevent us from calling the wrong one. For transferring, I wonder if that should take a security type or if we should document that the implementation should match the security type from the source storage in the destination?

I see in your branch that you've already done a bunch of work to pull out the trait, so maybe it's reasonable to pull it out but plan that it's not our "final version" -- we can follow up by identifying some good refactors in separate issues. Do you or @gatowololo have preferences on the ordering (refactor storage vs pull out trait)?

marsella commented 1 year ago

One other idea I just thought of is to have the library handle encryption itself; that is, there's a wrapper around the StorageT trait that's used internally and encrypts things that are marked Private before passing them to the StorageT implementation (and then StorageT would lose the SecurityType tag). This would avoid the user needing to worry about implementing a secure StorageT instantiation, which seems more misuse-resistant, if not some additional complexity on the library side.

The difficulty of this approach is that we now need to also handle key management for secure storage. I don't want to try to make that decision in a generic way (or even to make the decision of, like, how are we encrypting and what's your key size and so on).

amaloz commented 1 year ago

For example, I don't like the current method of storing plain bytes even though we know the type of everything stored, and I don't like that we save the StorableType but don't actually tie that to the stored/retrieved type (which the caller has to pick/serde manually). I'd rather pass Rust types directly in the storage API. I'm not sure the best way to do this (maybe have the value type be enum StoredData which is like the current StorableType enum, but actually containing the data??)

I really like this idea. To avoid exposing StoredData to the user I think we can keep the StorageT trait the same and have an internal store operation that remaps the StoredData values into key and data values to be consumed by the trait.

Question about the abstraction: Do you think the key type will it always be the tuple of (stored type, session id, party id)? I'm trying to think about whether it's plausible that we'd change that index type or use different ones for different scenarios internally, or whether it'd be reasonable to make StorageIndex public (but the internals private) and use it directly. Is the fully generic key type going to trip us up at some point?

Good question. I honestly don't know. A fully generic key type is certainly the most general, so if there is an inkling we'd need to store different key-value pairs I can see it being a benefit (that way the public API is stable). But it certainly is more general than what we need now.

I like the security type flag. It's an improvement over what we currently have. I wonder if we could try to tie this to the stored data type, since we already know what the secure types are. This would just be an internal misuse-resistance thing to prevent us from calling the wrong one. For transferring, I wonder if that should take a security type or if we should document that the implementation should match the security type from the source storage in the destination?

Yes, this is a great idea. As above, we can keep this machinery internal and still require the StorageT type to separate out public and private stores.

I see in your branch that you've already done a bunch of work to pull out the trait, so maybe it's reasonable to pull it out but plan that it's not our "final version" -- we can follow up by identifying some good refactors in separate issues. Do you or @gatoWololo have preferences on the ordering (refactor storage vs pull out trait)?

I was mostly doing that because it helps me reason through the API when I make the changes explicit in the code. I'm open to ditching the branch if we'd like to take a different approach to the refactor. Alternatively, I think what I have now might be a good starting point (in that it pulls out the StorageT trait [I'm also open to a better name than StorageT...]), and it should be straightforward to make some of the above changes on top of that version.

gatoWololo commented 1 year ago

I like the current direction you guys are proposing. Some thoughts (Part 1/N):

We can then pass this wrapped in Arc<Mutex<>> to the top-level Participant, which can pass this on to all the sub-protocols. Now, each protocol has access to the main_storage produced by the caller, as well as any local_storage it needs.

Is it necessary to Arc<Mutex<>> the entire ParticipantInfo struct, or just main_storage? Probably the latter is sufficient?

I like this approach. Specially having the Participant and Sub-Participants share the participant IDs via a reference. Ideally, A sub-participant would not get a reference to the whole main_storage, but only the subset it needs?

is it too risky to have the caller implement the Storage trait? We could have a default implementation that the caller could use while still allowing the caller to implement their own if necessary.

I think it depends how complicated the storage implementation ends up being for the caller. Either way, we will need a "reference" implementation for our own testing and prototyping.

I'm thinking it would be good to throw other_participant_ids into Storage, where it can be tagged with a session id, so that Participants can handle different peer groups in different sessions concurrently.

Right, I think one nice thing about the current implementation is the separation of code and state. We could even take it one step further: I have been thinking that we would instantiate a new Participant per session (passing in the necessary starting state). This would work well with our task-based Tokio async model our server will be using. We can spawn a new participant object to run on our Tokio tasks. So multiple concurrent sessions would not share any state with each other.

gatoWololo commented 1 year ago

I don't like the current method of storing plain bytes even though we know the type of everything stored, and I don't like that we save the StorableType but don't actually tie that to the stored/retrieved type (which the caller has to pick/serde manually).

Agreed! For example, StorableType::BroadcastSet appears to be associated with a message_votes: HashMap<BroadcastIndex, Vec<u8>> but this is not obvious except at the usage sites. It would be nice for the Storage to be aware of this.

I'd rather pass Rust types directly in the storage API. I'm not sure the best way to do this (maybe have the value type be enum StoredData which is like the current StorableType enum, but actually containing the data??)

Ideally the storage would work with Rust types directly. Our Rust types would implement ser/de still so that the user of our library could e.g. store them in some persistent storage.

This could be done a few ways: 1) As you pointed out, using an enum that associates the enum tag with the data:

enum StorableData {
   BroadcastSet(HashMap<BroadcastIndex, Vec<u8>>),
}

One challenge is that Storage is both used for main storage and local storage. So it needs to support many types of data and StorableTypes. We only need the calling application to manage main storage right? So I suggest having to separate types: LocalStorage and MainStorage this would significantly reduce the types of data our trait needs to allow for.

marsella commented 1 year ago

After some further offline discussion, we decided that this library should not interact with persistent storage at all. The calling application will handle storage and retrieval. This allows the calling application to store tss-ecdsa outputs alongside any use-case-specific context. The library should now:

take persistent-storage inputs as parameters when initiating a new protocol run, and
return persistent-storage items and their context (e.g. session ID, associated participant ID) to the calling application after a successful subprotocol run.

marsella commented 1 year ago

@amaloz I'm going to take over the remainder of this issue, which is writing up issues to get rid of all library-controlled persistent storage.

marsella commented 1 year ago

I made #203 #204 #205 and #213. Closing this as I think the plan is ready.