chronaeon / beigepaper

Rewrite of the Yellowpaper in non-Yellowpaper syntax.
MIT License
783 stars 85 forks source link

2.1.1 confusion #19

Open rkapka opened 4 years ago

rkapka commented 4 years ago

Section 2.1.1. Merkle Particia Trees states that

This allows the state data structure itself to represent not only the intrinsically correct paths in the data, but also the requisite cryptographic proofs which go into making sure that a piece of data was valid in the first place.

To me the highlighted part means that the modified tree allows for cryptographic proofs, implying that a non-modified tree would not allow this. If this was the intent, then in my opinion it is not true - any Merkle tree allows such proofs. This article from the official Ethereum wiki states that Merkle Patricia Trees were chosen because of optimization opportunities.

morenoh149 commented 4 years ago

that link is to https://github.com/chronaeon/beigepaper/issues/url did you mean to link to the wiki?

rkapka commented 4 years ago

Oops 😅

I have edited the hyperlink.

chronaeon commented 4 years ago

@morenoh149 thanks for the link! @rkapka good points -- how would you re-phrase it?

I'm merely trying to drag the line between inherent vagueness and too much detail, and in this case it looks like I missed the point entirely! I thank you for this correction.

I'll link the appropriate section here:

Optimization Merkle Patricia tries solve the inefficiency issue by adding some extra complexity to the data structure. A node in a Merkle Patricia trie is one of the following:

NULL (represented as the empty string) branch A 17-item node [ v0 ... v15, vt ] leaf A 2-item node [ encodedPath, value ] extension A 2-item node [ encodedPath, key ] With 64 character paths it is inevitable that after traversing the first few layers of the trie, you will reach a node where no divergent path exists for at least part of the way down. It would be naive to require such a node to have empty values in every index (one for each of the 16 hex characters) besides the target index (next nibble in the path). Instead we shortcut the descent by setting up an extension node of the form [ encodedPath, key ], where encodedPath contains the "partial path" to skip ahead (using compact encoding described below), and the key is for the next db lookup.

In the case of a leaf node, which can be determined by a flag in the first nibble of encodedPath, the situation above occurs and also the "partial path" to skip ahead completes the full remainder of a path. In this case value is the target value itself.

The optimization above however introduces some ambiguity.

When traversing paths in nibbles, we may end up with an odd number of nibbles to traverse, but because all data is stored in bytes format, it is not possible to differentiate between, for instance, the nibble 1, and the nibbles 01 (both must be stored as <01>). To specify odd length, the partial path is prefixed with a flag.

Based on this reading, it seems clear to me that the Beigepaper would benefit by including the gist of this whole excerpt and excluding my initial assertion that MP trees somehow contribute to the overall veracity of the Blockchain data structure.