mindbeam / unbase

Convert memo identification to hash / Merkle #8

Open dnorman opened 7 years ago

dnorman commented 7 years ago

Initial efforts used a slab-managed counter for the Memo id, for simplicity's sake. This is of course improper for the unbase MVP.

Each Memo should be identified by the hash of its full contents, without exception. Nothing considered to be memo data shall be omitted from this hash. This is essential to avoid improper collisions between non-identical memos. This means the full immutable portion of the data which constitutes a Memo:

Peering status or any other metadata is of course not identifying, and should be excluded.

This should entail multihashing, such that different hashes may be used, and the system may be evolved in response to cryptographic threats. The multihash data type should be a compact binary representation of:
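As a rough illustration of what a compact binary representation could look like, here is a minimal sketch of the layout used by the published multiformats multihash format: a varint hash function code, a varint digest length, then the digest bytes. Whether this matches the exact field list intended above is an assumption. The helper names `encode_varint` and `encode_multihash` are hypothetical, and the digest is taken as an input so that no particular hashing crate is implied.

```rust
/// Append `value` as an unsigned varint: 7 bits per byte, low bits first,
/// with the high bit set on every byte except the last.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Code assigned to SHA2-256 in the multicodec table.
const SHA2_256: u64 = 0x12;

/// Build `<varint code><varint digest length><digest bytes>`.
/// The digest is assumed to have been computed elsewhere.
fn encode_multihash(code: u64, digest: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(digest.len() + 2);
    encode_varint(code, &mut out);
    encode_varint(digest.len() as u64, &mut out);
    out.extend_from_slice(digest);
    out
}
```

Because the function code travels with every ID, a slab could accept memos hashed with a newer function alongside older ones, which is the evolution path described above.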

Important design feature: For new memos, the ID should not actually be calculated until such time as the memo is to be serialized for expression over a portion of the network outside the local OS process.

Why? It is desirable to avoid generating this memo ID for as long as possible, and in theory the ID should not be required for transfer or projection by any slab in the same process. This is because the links between the memos are direct, not dependent on their identity.

Depending on the parameters set forth by policy, some Memos will be generated, used, and die (having been declared to have a durability target of zero) all within the same slab or OS process.

If done correctly, this may allow the system to approach the performance of "native" data structures, as hashing effort can be deferred, delegated, and in some cases skipped entirely.
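The deferred calculation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Memo` fields and method names are hypothetical rather than unbase's actual types, and `DefaultHasher` is a non-cryptographic stand-in used so the sketch needs only the standard library. A real implementation would use a cryptographic hash wrapped in a multihash. Parents are held as direct `Arc` links, so traversal never touches an ID, and the peering counter is deliberately left out of the hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical memo layout; field names are illustrative only.
struct Memo {
    // Immutable, identifying content: all of it feeds the hash.
    subject: u64,
    parents: Vec<Arc<Memo>>, // direct links, usable without any ID
    body: Vec<u8>,
    // Non-identifying metadata: never hashed.
    peer_count: AtomicUsize,
    // Filled in on first request, then cached.
    id: OnceLock<u64>,
}

impl Memo {
    fn new(subject: u64, parents: Vec<Arc<Memo>>, body: Vec<u8>) -> Arc<Memo> {
        Arc::new(Memo {
            subject,
            parents,
            body,
            peer_count: AtomicUsize::new(0),
            id: OnceLock::new(),
        })
    }

    /// Record one more peer; this does not affect the memo's ID.
    fn note_peer(&self) {
        self.peer_count.fetch_add(1, Ordering::Relaxed);
    }

    /// True once the ID has been calculated and cached.
    fn id_is_computed(&self) -> bool {
        self.id.get().is_some()
    }

    /// Calculate the ID on first use, e.g. just before serialization.
    /// Parent IDs are folded in, giving a Merkle-style identity.
    /// Stand-in hash: DefaultHasher is NOT cryptographic.
    fn id(&self) -> u64 {
        *self.id.get_or_init(|| {
            let mut hasher = DefaultHasher::new();
            self.subject.hash(&mut hasher);
            for parent in &self.parents {
                parent.id().hash(&mut hasher);
            }
            self.body.hash(&mut hasher);
            hasher.finish()
        })
    }
}
```

A memo that is created, read through its `parents` links, and dropped within one process never pays for hashing. Only a call to `id()` forces the calculation, and it recursively forces the parents' IDs as well, so very deep chains would want an iterative walk instead of this recursion.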

The most likely challenge that this will present is that the Slab will no longer be able to index Memorefs by memo ID, as the memo may need to be resident before the Memo ID has been calculated. Some discussion may be required to clarify this.