celestiaorg / nmt

Namespaced Merkle Tree
Apache License 2.0
112 stars 39 forks source link

Use precomputed hashes for certain padding shares as an optimization #208

Open cmwaters opened 1 year ago

cmwaters commented 1 year ago

As a celestia-oriented optimization, the following padding shares with

Could have the sha256 precomputed. In the HashLeaf method, we could simply identify if it is a padding share and if it has one of those namespaces and use the precomputed hash instead. Furthermore we could extend it to the HashNode method in the case that there are 2, 4, 8, 16 etc padding shares in a row.

rootulp commented 1 year ago

Agreed with the optimization. Note: there are no padding shares with the TxNamespace or PFBNamespace because sequences in the reserved namespaces don't conform to non-interactive default rules.

There are 3 types of padding shares (see specs):

  1. Namespace padding which can't be pre-computed because it varies based on namespace of the previous blob
  2. Reserved padding which can be pre-computed
  3. Tail padding which can be pre-computed
cmwaters commented 1 year ago

Yeah that makes sense. I was under the impression that the reserved namespaces could also have padding (i.e. they were still part of the blob share commitment rules)

cmwaters commented 1 year ago

Is it possible to quickly identify a padding share without checking that all the raw bytes are 0?

rootulp commented 1 year ago

Is it possible to quickly identify a padding share without checking that all the raw bytes are 0?

Yes. Padding shares have a sequence start indicator of 1 and a sequence length of 0 so a client only needs to parse the prefix of a share to determine if it is padding. Specifically, a client can parse the raw bytes of a share at index 29 (info byte which contains sequence start indicator) and 30 - 34 (sequence length) to determine if it is padding.

Note: IsPadding exists but it operates on an entire share.