Daohub-io / cap9

Capability-based security protocol for smart contracts
Apache License 2.0

Refactor into cap9-core #176

Closed. JakeOShannessy closed this 5 years ago.

JakeOShannessy commented 5 years ago

As previously discussed there are some significant areas in need of refactoring. One of the critical ones I see is that currently cap9-std (which is linked into every contract) depends on validator. This is undesirable from an architectural perspective, and it brings in an unnecessary dependency for contracts other than the kernel (hopefully it would get optimised out, but we don't want to rely on that).

This PR does that, and also tries to make the encoding methods we use more uniform. After this pull request there are two types of encoding, clearly written out in the types: a byte-based encoding and a 32-byte-word-based encoding.

These are implemented by Deserialize<u8> and Deserialize<U256> respectively. As part of this refactor, the implementations for each type are split out. For example, this reduces from_u256_list from a large custom loop to a few lines of dispatch that decode the relevant enum types.
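For a sense of what that dispatch can look like, here is a minimal sketch. The type codes, variant names, and the `[u8; 32]`/`u64` stand-ins for U256 are all hypothetical, not the actual cap9 definitions:

```rust
// Hypothetical sketch only: type codes, variant names, and the
// [u8; 32]/u64 stand-ins for U256 are illustrative, not cap9's.
type Word = [u8; 32];

/// Build a 32-byte word holding a small big-endian integer.
fn word(n: u64) -> Word {
    let mut w = [0u8; 32];
    w[24..32].copy_from_slice(&n.to_be_bytes());
    w
}

/// Read a small big-endian integer back out of a 32-byte word.
fn u64_from(w: &Word) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&w[24..32]);
    u64::from_be_bytes(b)
}

#[derive(Debug, PartialEq)]
enum Capability {
    ProcedureCall,                           // illustrative variant
    StoreWrite { location: u64, size: u64 }, // illustrative variant
}

/// Top-level decoding is just a dispatch on the capability type code;
/// each capability decodes its own fields from the remaining words.
fn deserialize_cap(words: &mut std::slice::Iter<Word>) -> Option<Capability> {
    let code = u64_from(words.next()?);
    match code {
        3 => Some(Capability::ProcedureCall),
        7 => {
            let location = u64_from(words.next()?);
            let size = u64_from(words.next()?);
            Some(Capability::StoreWrite { location, size })
        }
        _ => None,
    }
}
```

The point is that the top-level decoder only matches on the type code; each capability owns the rest of its own parsing.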

Another change is that each capability is split into its own file. This should make them easier to manage, as the definition for each capability is very uniform. Each file contains exactly (and only):

  1. The index/code of the capability.
  2. The size of the capability.
  3. The structure of the capability.
  4. The subset behaviour of the capability.
  5. The 32-byte deserialisation of the capability.
  6. The 32-byte serialisation of the capability (in progress).
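As a rough sketch of what items 5 and 6 (the 32-byte serialisation pair) could look like for one capability, using a plain `[u8; 32]` as a stand-in for U256. The field names and word layout here are illustrative, not the actual cap9 encoding:

```rust
// Illustrative sketch of items 5 and 6 for one capability file; the field
// names, word layout, and [u8; 32] stand-in for U256 are hypothetical.
type Word = [u8; 32];

#[derive(Debug, PartialEq, Clone, Copy)]
pub struct StoreWriteCap {
    pub location: u64,
    pub size: u64,
}

/// Pack a small integer into a big-endian 32-byte word.
fn to_word(n: u64) -> Word {
    let mut w = [0u8; 32];
    w[24..32].copy_from_slice(&n.to_be_bytes());
    w
}

/// Unpack a small integer from a big-endian 32-byte word.
fn from_word(w: &Word) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&w[24..32]);
    u64::from_be_bytes(b)
}

impl StoreWriteCap {
    /// 6. Serialise into 32-byte words.
    pub fn serialize(&self) -> Vec<Word> {
        vec![to_word(self.location), to_word(self.size)]
    }

    /// 5. Deserialise from 32-byte words, rejecting the wrong length.
    pub fn deserialize(words: &[Word]) -> Option<Self> {
        match words {
            [location, size] => Some(StoreWriteCap {
                location: from_word(location),
                size: from_word(size),
            }),
            _ => None,
        }
    }
}
```

Keeping both directions in the same file makes it easy to eyeball that they round-trip.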

This is of course only a work in progress, but I'm open to thoughts and suggestions. It needn't fix everything; it's simply that rearranging the modules was quite a breaking change, and I thought now might be the easiest time to do so.

TODO:

JakeOShannessy commented 5 years ago

Note that all the per-capability parsing logic from from_u256_list is preserved; it is now located under the impl Deserialize<U256> for each capability, and the looping indices have been replaced with an iterative cursor (as used elsewhere in the code-base).
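A minimal sketch of what such an iterative cursor can look like (purely illustrative; not the actual cap9_core cursor type):

```rust
// Minimal sketch of an iterative cursor over 32-byte words; purely
// illustrative, not the actual cap9_core cursor type.
type Word = [u8; 32];

struct Cursor<'a> {
    words: &'a [Word],
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(words: &'a [Word]) -> Self {
        Cursor { words, pos: 0 }
    }

    /// Consume and return the next word, advancing the cursor; returns
    /// None once the input is exhausted. Call sites never touch indices.
    fn read_word(&mut self) -> Option<&'a Word> {
        let w = self.words.get(self.pos)?;
        self.pos += 1;
        Some(w)
    }

    /// How many words remain unread.
    fn remaining(&self) -> usize {
        self.words.len() - self.pos
    }
}
```

Running off the end becomes an ordinary None rather than an index bounds error, which is why per-capability decoders stay short.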

Latrasis commented 5 years ago

I'm thinking: if we're separating each cap into its own file, wouldn't it make sense to have each cap include the corresponding syscall implementation via a trait definition?

We could extend the AsCap trait like so:

```rust
pub trait AsCap {
    /// The type code of the capability.
    const CAP_TYPE: u8;
    /// The syscall guarded by this capability.
    type SysCall: ?Sized;
    type SysCallError;
    /// Execute the capability with the given syscall.
    fn execute(&self, syscall: &Self::SysCall) -> Result<(), Self::SysCallError>;
    /// Check whether this capability is a subset of `parent_cap`.
    fn is_subset_of(&self, parent_cap: &Self) -> bool;
}
```
JakeOShannessy commented 5 years ago

I've been thinking about it and I'm not sure. I think placing them both in the same file makes sense; I was certainly thinking about that in regards to the indices. I guess the question is: is the correspondence between capabilities and syscalls defined to be 1:1, or is that just how it is at the moment? We could always try and see.

JakeOShannessy commented 5 years ago

I think it will have similar issues to the reasons we separate NewCapability and Capability. They may correspond, but they are intrinsically different. For example: we are often working with capabilities that don't have parameters (e.g. StoreWrite(0x8000,2)). There is no way to execute this, but we want to check if it is a subset of another capability (for registration). Therefore is_subset_of and execute cannot be in the same trait.

Thoughts?
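To illustrate the point with the StoreWrite(0x8000,2) example: a parameterless capability is just a storage range, and the subset check needs nothing more. A sketch with hypothetical names, not the actual cap9 implementation:

```rust
// Sketch with hypothetical names: a parameterless StoreWrite capability is
// just a storage range, so a subset check needs no syscall at all.
struct StoreWriteCap {
    location: u64,
    size: u64,
}

impl StoreWriteCap {
    /// A child range is a subset iff it lies entirely within the parent range.
    fn is_subset_of(&self, parent: &StoreWriteCap) -> bool {
        self.location >= parent.location
            && self.location + self.size <= parent.location + parent.size
    }
}
```

Nothing here can be executed; there is no syscall in sight, which is the argument for keeping is_subset_of and execute in separate traits.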

Latrasis commented 5 years ago

> I think it will have similar issues to the reasons we separate NewCapability and Capability. They may correspond, but they are intrinsically different. For example: we are often working with capabilities that don't have parameters (e.g. StoreWrite(0x8000,2)). There is no way to execute this, but we want to check if it is a subset of another capability (for registration). Therefore is_subset_of and execute cannot be in the same trait.
>
> Thoughts?

I'm not sure I follow. The parameter for StoreWriteCap::execute would be the Write Syscall. So this would be:

```rust
impl AsCap for StoreWriteCap {
    // ...
    type SysCall = WriteSyscall;
    type SysCallError = (); // some error type

    fn execute(&self, syscall: &Self::SysCall) -> Result<(), Self::SysCallError> {
        // validation...
        let value_h256: H256 = syscall.value.into();
        pwasm_ethereum::write(&syscall.key.into(), value_h256.as_fixed_bytes());
        Ok(())
    }
}
```

Or am I misunderstanding something?

Latrasis commented 5 years ago

I think if we do refactor, it would make sense to replace #{to, from}_u256_list, since they add confusion.

JakeOShannessy commented 5 years ago

Oh that could work, so then the syscall just becomes parameters for the capability. That might help reduce moving parts.

Did you make that comment before or after that commit? Behind the scenes they're gone, I'm just doing it piece by piece.

Latrasis commented 5 years ago

Looks good. My only concern is that there's a lot of allocation involved when handling large types using the cap9_core::Read in the Deserialize#deserialize trait. But I think we can leave that for later.

JakeOShannessy commented 5 years ago

> Looks good. My only concern is that there's a lot of allocation involved when handling large types using the cap9_core::Read in the Deserialize#deserialize trait. But I think we can leave that for later.

Sure, but can you elaborate? The use of iterators should be fairly cheap, and there doesn't seem to be any heap allocation that wasn't there before, unless I'm just missing it.

Latrasis commented 5 years ago

> Looks good. My only concern is that there's a lot of allocation involved when handling large types using the cap9_core::Read in the Deserialize#deserialize trait. But I think we can leave that for later.

> Sure, but can you elaborate? The use of iterators should be fairly cheap, and there doesn't seem to be any heap allocation that wasn't there before, unless I'm just missing it.

Oh ok, I didn't run twiggy yet. I guess the only prominent case is when deserializing a Payload in a SyscallActionCall.

JakeOShannessy commented 5 years ago

Yeah, and that's now a single allocation, which is kind of inherent. I think the only way to optimise it any further would be to make serialisation and deserialisation a single step where possible, but that's definitely another PR.

One thing I was tempted to do was modify the Read trait (now that we don't need to be compatible with the instruction parser) to skip a few more memcpys, but that's a marginal gain and also probably belongs in a separate PR.

JakeOShannessy commented 5 years ago

I'll merge now, but if there are any other marginal things you were thinking about, you could always chuck them in an issue and we can address them in another PR when we feel we have the space.