Should we define multiple proof / path representations?

OR13 commented 1 year ago

Originally:

TODO: Do we need both inclusion path types? what properties does each type have?

Option 0

IndexAwareInclusionPath = [
    tree_size: int ; added here so we can compare
    leaf_index: int
    hashes: [+ bstr]
]

Option 1

IndexUnawareInclusionPath = [+ PathEntry]
PathEntry = [
    left: bool
    hash: bstr
]

EDIT: Note that this is CDDL above, and square brackets indicate arrays. Giving array elements names is just syntactical and similar to C-style structs. They are still arrays.

OR13 commented 1 year ago

Are both options equally performant in balanced and unbalanced trees? (not sure)

Do both options reveal the same information to a verifier? (no, tree size and member position in tree are revealed in option 0)

Does index positioning reveal "order over time time" ? ( I think so )

OR13 commented 1 year ago

There is some bias towards option 0 if we believe that communicating tree size happens in the header.

OR13 commented 1 year ago

I'm not a "COSE / CBOR" encoding expert, but it seems that there is a more compact representation for "index unaware proofs"

IndexUnawareInclusionPath = [+ [ bool, bstr ] ]

For large tree sizes where the size is not revealed and the member index is not revealed...

it could be more compact, since 2 int can be substituted for 1 bit per hash.

letmaik commented 1 year ago

There's also the variant that QLDB uses where the node hashes are sorted before hashing such that at inclusion path verification you don't need the direction or leaf index because you repeat the same sorting and know implicitly whether it goes left or right: see also OpenZeppelin. Obviously this only works for that specific tree type.

OR13 commented 1 year ago

Another consideration is "multiple proof disclosures"...

Its possible that one option performs better than the other after a certain percentage of the tree is revealed.

OR13 commented 1 year ago

As a principle, it would be nice to define the "most compact possible" representation... even if its not the "most popular one".

letmaik commented 1 year ago

I'm not a "COSE / CBOR" encoding expert, but it seems that there is a more compact representation for "index unaware proofs"

You mean like this?

IndexUnawareInclusionPath = [int, [bstr]]

Assuming the int is up to 64 bit, then it would support large enough trees and act as a direction flag bitvector. With CBOR, integers would use the most compact type in most libraries, which typically would be 1 or 2 bytes (trees up to depth 16).

OR13 commented 1 year ago

@letmaik yes that works as well... in my first attempt to make compact proofs I used a bit string for directions, and then concatenated with hashes, it looked sorta like this:

010... hash-0, hash-1, ...

https://github.com/transmute-industries/verifiable-data/blob/1319ff2bf72ec7da1a90acfee7a424ab2962ef76/packages/merkle-proof/src/index.ts#L203

I removed this (and the pako compression) in the next revision to keep the algorithms simpler / cut down on dependencies.

OR13 commented 1 year ago

Example QLDB Proof with other stuff wrapped around it:

{
  "Proof": {
    "IonText": "[{{xwtiJrjaFEezJtfPnZZJkZniHpif/Ko58kEcdrSvUD0=}},{{DkEJ8WxOwHcrLtvv+hCmur3LIuXm5TP4qI+oFdV+nPI=}},{{H5SvpW4VsFRBbl2XEpJpAydwetVSixUi+epX7CD8FWY=}},{{UWx7JmpQLZEDdprLGz22cfBbjZkwFKIDjOj5ejRueVQ=}},{{4tujyvTE8a0N3pSk8ItGjkxho7mtOyvIm/mSW9LtrA4=}},{{eIWAJefiyGtwquKvX3dxs50g65aWiC8457IR1CAG8wg=}},{{BCAa4xj0NhToMx694Tx+d7dlRpF8cWIApDLVzgu26pY=}},{{5UgPeK4mx6ryvdkBZwkfOV1lsR60+7WLpRFKCj/DGX8=}},{{PlGnVVc976YN5ven3wluckeFTpEpzbC6Xicjg0+51aU=}},{{NQWgMUnpqoKrpyTsZgkasf0NxHJe1Wi5LdmpfNFNin0=}},{{YGmPTziy2xt7wU6n3QaFCbBKE7FMr82CajoMz/PQX7s=}},{{OjELd9pepTZiojuX3RkEbvo6UdBlPM3m+ht0RVSmobY=}},{{8AqUg8lzSIiAfUVWmtFjBKQGFK2gHwAOE9r/G403YA4=}}]"
  },
  "Revision": {
    "IonText": "{blockAddress:{strandId:\"5EhcZ4QwhGvDaV3PN3GoRd\",sequenceNo:672},hash:{{k07YWpQB13KYIKWovxa3GPuc+b+WmoWA3rUK7Dx1Mz8=}},data:{lastName:\"Doe\",age:42,firstName:\"Timmy\"},metadata:{id:\"7j4KlV1swnTBGJhuF6l4kZ\",version:0,txTime:2023-02-26T16:33:34.423Z,txId:\"47VIqjNef3N9p85eEn0dDO\"}}"
  }
}

An example CBOR ish minimal QLDB receipt:

IndexUnawareInclusionPath = [+ bstr]

letmaik commented 1 year ago

Another variant of inclusion proof is simply providing all other members by hash. Then the whole tree has to be re-built to get to the root. This would only realistically work for the disclosure case (as the trees are typically small), rather than transparency logs. I wonder if there's ever a point to choose this form compared to path-based proofs.

OR13 commented 1 year ago

@fournet and I discussed, we feel that each tree alg should define only 1 inclusion path representation, and supporting multiple inclusion path representations for the same tree alg should be discouraged.

OR13 commented 1 year ago

We propose to close this issue in https://github.com/ietf-scitt/draft-steele-cose-merkle-tree-proofs/pull/9

ietf-scitt / cose-merkle-tree-proofs

Should we define multiple proof / path representations? #6

Option 0

Option 1