[Documentation] How to calculate block hash and block body hash.

michele-nuzzi commented 1 year ago

It is my understanding that the block hash is the hash of the header of a block;

The hash of the header is the one later included in the next block header as the prev_hash field (null in case the block in question is genesis?).

The header of a block includes the hash of the body as a field, commented in the CDDL with ; merkle triple root

In the source, the best I could find is hashAlonzoTxSeq,

where it seems the hash of the block body is calculated as something like this:

hash(
    hash( toCbor( bodies ) ) + 
    hash( toCbor( witnesses ) ) + 
    hash( toCbor( metadata ) ) + 
    hash( toCbor( isValidSet ) )
)

( where "+" is the concatenation of bytestrings )

Questions

1) is my understanding of the block_hash being different than block_body_hash correct?

2) the Hash type kinda hides the hash algorithm, is it safe to assume we are talking of blake2b with digest of 32 bytes?

3) what is the ; merkle triple root comment for? I see 4 parts being hashed in hashAlonzoTxSeq. are the transaction bodies hashed as a merkle tree? or just as a cbor array of tx bodies?

4) CBOR can represent the same data in many ways, is there a standard encoding for the block elements?

JaredCorduan commented 1 year ago

is my understanding of the block_hash being different than block_body_hash correct?

I think so. I don't see where you got the variable name block_hash from, but yes, each block header body contains a field prev_hash, which is a hash of the previous block header (and yes, each block header body contains a hash of its body, more on that below). The types in Figure 53 of the Shelley ledger spec (hopefully) make this clear.

the Hash type kinda hides the hash algorithm, is it safe to assume we are talking of blake2b with digest of 32 bytes?

That's correct. All the crypto primitives are explained in Appendix A of the Shelley ledger spec.
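As a rough illustration of the two answers above, here is a minimal Python sketch of hashing a header with Blake2b and a 32-byte digest. The header bytes are a made-up placeholder, not a real Cardano header, and the helper name blake2b_256 is chosen for this example only; a real implementation would hash the header's CBOR bytes exactly as received.

```python
import hashlib


def blake2b_256(data: bytes) -> bytes:
    """Blake2b with a 32-byte (256-bit) digest."""
    return hashlib.blake2b(data, digest_size=32).digest()


# Hypothetical stand-in for the serialised header of block N.
# A real header is CBOR, taken byte-for-byte from the wire.
header_n_bytes = b"placeholder bytes for header N"

# Hash of the header of block N.
block_hash_n = blake2b_256(header_n_bytes)

# The header body of block N+1 would carry this value in prev_hash.
prev_hash_in_next_header = block_hash_n

print(len(block_hash_n), block_hash_n.hex())
```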

what is the ; merkle triple root comment for? I see 4 parts being hashed in hashAlonzoTxSeq. are the transaction bodies hashed as a merkle tree? or just as a cbor array of tx bodies?

It's more like a merkle bonsai :) , and "triple" should be replaced by "quadruple" (it used to be a triple up until the Alonzo era, when the "is valid" flags were introduced to transactions). So yes, it's just the hash of four concatenated hashes. The idea being, for example, you could throw away the witnesses after you've checked them and only need their hash to verify stuff.
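The "hash of four concatenated hashes" can be sketched in Python as follows. This is an illustration under assumptions, not the ledger's implementation: the function names are invented for this example, the hash is taken to be Blake2b with a 32-byte digest, and the four byte strings are placeholder CBOR for empty containers rather than data from a real block.

```python
import hashlib
from typing import Sequence


def blake2b_256(data: bytes) -> bytes:
    """Blake2b with a 32-byte (256-bit) digest."""
    return hashlib.blake2b(data, digest_size=32).digest()


def body_hash_from_segment_hashes(segment_hashes: Sequence[bytes]) -> bytes:
    """Hash the concatenation of the four per-segment hashes."""
    if len(segment_hashes) != 4:
        raise ValueError("expected exactly four segment hashes")
    return blake2b_256(b"".join(segment_hashes))


def block_body_hash(bodies: bytes, witnesses: bytes,
                    metadata: bytes, is_valid: bytes) -> bytes:
    """Hash each segment's bytes as given, then hash the four hashes."""
    segments = (bodies, witnesses, metadata, is_valid)
    return body_hash_from_segment_hashes([blake2b_256(s) for s in segments])


# Placeholder segments: empty CBOR array (0x80) or empty map (0xa0).
bodies, witnesses, metadata, is_valid = b"\x80", b"\x80", b"\xa0", b"\x80"
full = block_body_hash(bodies, witnesses, metadata, is_valid)

# After checking the witnesses, only their 32-byte hash needs to be kept.
witnesses_hash = blake2b_256(witnesses)
pruned = body_hash_from_segment_hashes(
    [blake2b_256(bodies), witnesses_hash,
     blake2b_256(metadata), blake2b_256(is_valid)]
)
print(full == pruned)
```

The second call shows the point about discarding witnesses: the same body hash can be recomputed from the stored witness hash alone.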

CBOR can represent the same data in many ways, is there a standard encoding for the block elements?

Nope. There is such a thing as canonical CBOR, but we explicitly chose not to use it for hashes on user-supplied data. For the block body hash, you need to concatenate the hashes of the exact CBOR bytes supplied by the SPO for each of the four components.
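To see why the exact bytes matter, consider this small Python sketch. It hashes two well-formed CBOR encodings of the same value, an empty array: once in definite-length form and once in indefinite-length form. The helper name blake2b_256 is an assumption made for this example.

```python
import hashlib


def blake2b_256(data: bytes) -> bytes:
    """Blake2b with a 32-byte (256-bit) digest."""
    return hashlib.blake2b(data, digest_size=32).digest()


# Both byte strings decode to the same value: an empty array.
definite = bytes.fromhex("80")      # definite-length array, 0 items
indefinite = bytes.fromhex("9fff")  # indefinite-length array, then "break"

# Same decoded value, different bytes, therefore different hashes.
print(blake2b_256(definite).hex())
print(blake2b_256(indefinite).hex())
print(blake2b_256(definite) == blake2b_256(indefinite))
```

A verifier that decodes a segment and encodes it again may pick a different form than the block producer did, which would change the resulting hash.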

michele-nuzzi commented 1 year ago

For the block body hash, you need to concatenate the exact CBOR bytes supplied by the SPO for each of the four components.

If I build a block on my own, will any other node be able to independently verify my block?

JaredCorduan commented 1 year ago

If I build a block on my own, will any other node be able to independently verify my block?

Yes! Provided that the node in question does not re-serialise the block when computing the hash. That's the heart of the issue being discussed in https://github.com/input-output-hk/cardano-ledger/issues/2943.

Note that appendix section A.1 of the Shelley ledger spec issues a warning about re-serialization.

IntersectMBO / cardano-ledger

[Documentation] How to calculate block hash and block body hash. #3777