paritytech / substrate

Refactor the core datatypes #160

Closed (gavofyork closed 5 years ago)

gavofyork commented 6 years ago
mod new1 {
    use super::{Number, Hash};

    /// The header of a block. This is the minimal amount of data required to sync the chain at
    /// minimal viable security. It contains nothing more than cryptographic references.
    struct Header {
        /// Cryptographic reference to the previous block's header.
        parent: Hash,
        /// Cryptographic reference to pieces of extrinsic information (assumed to be an ordered
        /// list of `Extrinsic`, keyed into a Merkle trie by a 0-based index). Extrinsic
        /// information is usually just a set of transactions, but we make no specific
        /// requirements that are typical of transactions, e.g. signatures, in this case.
        extrinsics: Hash,
        /// A hash of the post-execution information ("receipt") for this block.
        receipt: Hash,
    }

    /// A block-level event. Encodes something that happened in-block which is important to
    /// light-clients. There are a number of reserved event types.
    type Event = Vec<u8>;

    /// A block receipt - contains all relevant post-execution information of the block.
    struct Receipt {
        /// The number of blocks that have preceded this in the chain.
        number: Number,
        /// The Merkle-trie hash of the final state of storage.
        storage_root: Hash,
        /// A list of events that this block's execution generated. Will contain parachain activity
        /// information as well as any important information on authority set transitions for light
        /// clients.
        /// NOTE: Information on "transaction receipts" (i.e. logs) is not contained here. Chains that
        /// need them will clear a particular location in storage and place them in storage during
        /// execution. The runtime will then form a Merkle-trie root from this stored data and deposit
        /// it as an event in the digest. Native, runtime-specific code will read storage for the logs
        /// and place them in a database so that (again, runtime-specific) light-client logic is able
        /// to query for particular logs and verify data through the digest.
        digest: Vec<Event>,
    }

    /// A header that includes the receipt, so that everything needed for a light-client to sync is contained.
    /// Note that a corresponding `Header` (and thus the header hash) can be constructed through the hash of
    /// the receipt and the `parent` and `extrinsics` fields.
    /// This structure is requested by light-clients when syncing blocks that happen after a trusted foundation
    /// block (i.e. genesis or a recent hard-coded hash+number failsafe). For syncing blocks prior to that point,
    /// where it is not necessary to track authority changes on every block, the normal `Header` can be used.
    struct FatHeader {
        parent: Hash,
        extrinsics: Hash,
        receipt: Receipt,
    }

    /// A single, isolatable chunk of extrinsic information.
    type Extrinsic = Vec<u8>;

    /// A full set of `Extrinsic` data.
    type Extrinsics = Vec<Extrinsic>;

    /// A "full" block.
    struct Block {
        /// The header.
        fat_header: FatHeader,
        /// The extrinsic information to which the header refers.
        extrinsics: Extrinsics,
    }
}
rphmeier commented 6 years ago

Keeping storage_root outside of the main header struct will put a non-negligible load on light clients' bandwidth and memory -- they'll have to obtain and check the entire receipt of a block every time they want a state proof.
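
To make the concern concrete, here is a minimal sketch of how a light client would obtain a trustworthy storage_root under the proposed layout. The `u64` hash type, the `DefaultHasher`-based `hash_receipt` and its field encoding are stand-ins assumed purely for illustration; a real chain would use a cryptographic hash over a canonical encoding.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-ins (assumptions): the proposal leaves `Hash` and `Number` abstract.
type Hash = u64;
type Number = u64;
type Event = Vec<u8>;

struct Header {
    parent: Hash,
    extrinsics: Hash,
    receipt: Hash,
}

struct Receipt {
    number: Number,
    storage_root: Hash,
    digest: Vec<Event>,
}

// Hypothetical receipt hash. NOT cryptographic; illustration only.
fn hash_receipt(r: &Receipt) -> Hash {
    let mut h = DefaultHasher::new();
    h.write_u64(r.number);
    h.write_u64(r.storage_root);
    h.write_usize(r.digest.len());
    for event in &r.digest {
        h.write_usize(event.len());
        h.write(event);
    }
    h.finish()
}

// Returns the storage root only if the supplied receipt matches the
// commitment in the header. The whole receipt, digest included, must be
// available to recompute that commitment.
fn trusted_storage_root(header: &Header, receipt: &Receipt) -> Option<Hash> {
    if hash_receipt(receipt) == header.receipt {
        Some(receipt.storage_root)
    } else {
        None
    }
}
```

In this sketch the client cannot check a state proof from the `Header` alone: it needs every field of the `Receipt`, including the full `digest`, before the storage root can be trusted.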

gavofyork commented 6 years ago

Light clients that actually need the storage_root in all instances would just grab a ladder of FatHeaders, which are pretty much equivalent to the existing Header.
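
A rough sketch of what checking such a ladder could look like, assuming "ladder" means a contiguous run of FatHeaders starting just after a trusted header hash. The `u64` hash type, the `DefaultHasher`-based hashing and the names `header_hash` and `verify_ladder` are hypothetical and not part of the proposal.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-ins (assumptions): the proposal leaves `Hash` and `Number` abstract.
type Hash = u64;
type Number = u64;
type Event = Vec<u8>;

struct Receipt {
    number: Number,
    storage_root: Hash,
    digest: Vec<Event>,
}

struct FatHeader {
    parent: Hash,
    extrinsics: Hash,
    receipt: Receipt,
}

// Hypothetical receipt hash. NOT cryptographic; illustration only.
fn hash_receipt(r: &Receipt) -> Hash {
    let mut h = DefaultHasher::new();
    h.write_u64(r.number);
    h.write_u64(r.storage_root);
    h.write_usize(r.digest.len());
    for event in &r.digest {
        h.write_usize(event.len());
        h.write(event);
    }
    h.finish()
}

impl FatHeader {
    // Hash of the thin `Header` implied by this fat header:
    // (parent, extrinsics, hash of receipt).
    fn header_hash(&self) -> Hash {
        let mut h = DefaultHasher::new();
        h.write_u64(self.parent);
        h.write_u64(self.extrinsics);
        h.write_u64(hash_receipt(&self.receipt));
        h.finish()
    }
}

// Checks that each fat header's `parent` equals the header hash of the one
// before it, starting from a header hash the client already trusts.
fn verify_ladder(trusted_parent: Hash, ladder: &[FatHeader]) -> bool {
    let mut expected_parent = trusted_parent;
    for fat in ladder {
        if fat.parent != expected_parent {
            return false;
        }
        expected_parent = fat.header_hash();
    }
    true
}
```

Once the ladder verifies, every `receipt.storage_root` in it is available without a separate request, which is the sense in which a FatHeader plays the role of the existing Header here.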

rphmeier commented 6 years ago

I'm not convinced that's any better, because then the clients are still downloading the receipt when they don't need that data at all. And then they have to download the header and the fat header, because the header hash isn't computed from the fat header.

gavofyork commented 6 years ago

If you have a FatHeader then Header is superfluous as it can always be derived from FatHeader (just hash the Receipt).

Light-clients shouldn't need to always download the FatHeader - if it's just a header-chain for sync/catchup, then the client isn't interested in the extrinsics and mostly isn't interested in the receipt. In the instances where a light-client would care about the receipt (e.g. a validator-set change), the serving node can provide a FatHeader instead of a Header.
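
As a sketch of the two points above (deriving a Header from a FatHeader, and a serving node choosing which one to send), assuming a `u64` stand-in for `Hash` and a non-cryptographic `DefaultHasher`. The `Served` enum, `serve`, `header_of` and the `receipt_is_relevant` flag are hypothetical names introduced only for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-ins (assumptions): the proposal leaves `Hash` and `Number` abstract.
type Hash = u64;
type Number = u64;
type Event = Vec<u8>;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Header {
    parent: Hash,
    extrinsics: Hash,
    receipt: Hash,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Receipt {
    number: Number,
    storage_root: Hash,
    digest: Vec<Event>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct FatHeader {
    parent: Hash,
    extrinsics: Hash,
    receipt: Receipt,
}

// Hypothetical receipt hash. NOT cryptographic; illustration only.
fn hash_receipt(r: &Receipt) -> Hash {
    let mut h = DefaultHasher::new();
    h.write_u64(r.number);
    h.write_u64(r.storage_root);
    h.write_usize(r.digest.len());
    for event in &r.digest {
        h.write_usize(event.len());
        h.write(event);
    }
    h.finish()
}

impl FatHeader {
    // "Just hash the Receipt": the thin header keeps `parent` and
    // `extrinsics` and replaces the receipt with its hash.
    fn to_header(&self) -> Header {
        Header {
            parent: self.parent,
            extrinsics: self.extrinsics,
            receipt: hash_receipt(&self.receipt),
        }
    }
}

// What a serving node might send for one block.
enum Served {
    Thin(Header),
    Fat(FatHeader),
}

// The caller decides whether the receipt matters for this block,
// e.g. because it carries a validator-set change.
fn serve(fat: &FatHeader, receipt_is_relevant: bool) -> Served {
    if receipt_is_relevant {
        Served::Fat(fat.clone())
    } else {
        Served::Thin(fat.to_header())
    }
}

// Either response yields the same thin header, and so the same header hash.
fn header_of(served: &Served) -> Header {
    match served {
        Served::Thin(header) => *header,
        Served::Fat(fat) => fat.to_header(),
    }
}
```

Because both variants reduce to the same `Header`, the client never needs to fetch a Header in addition to a FatHeader for the same block.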

gavofyork commented 6 years ago

@rphmeier if you're still unconvinced, maybe we can go over this in more detail in person?

gavofyork commented 5 years ago

Not going to happen in time for 1.0. Will shelve until 2.0 is mooted.