starlinglab / authenticated-attributes

Authenticated Attributes project by the Starling Lab
MIT License

Explore Value Types (Including Binary for Timestamp Proof) #4

Closed: katelynsills closed this issue 1 year ago

katelynsills commented 1 year ago

The values in our key-value store will need to carry more than one type of data. The value itself is likely to be a UTF-8 string, but we also want to store the timestamp of the attestation and a signature alongside it, which suggests a JSON object with 'timestamp' and 'signature' keys. A further complication is that our timestamping proof, if we use OpenTimestamps, is a binary .ots file, and that also has to go in the key-value store.

We need to figure out how to best encode all of the information that we need to put in the key-value store, given what is available to us as options in Hyperbee/Hypercore.
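
To make the complication concrete, here is a minimal sketch of what one stored value might look like if it were kept as JSON. The field names (`value`, `timestamp`, `signature`, `otsProof`) and the helper functions are assumptions for illustration, not the project's actual schema. JSON has no byte-string type, so the binary fields have to be base64-encoded on the way in and decoded on the way out.

```javascript
// Hypothetical envelope for one stored value; all field names are assumptions.
function makeEntry(value, timestamp, signature, otsProof) {
  return { value, timestamp, signature, otsProof };
}

// JSON cannot hold raw bytes: JSON.stringify(new Uint8Array([1, 2]))
// yields '{"0":1,"1":2}', so binary fields are base64-encoded here.
function entryToJsonBytes(entry) {
  const json = JSON.stringify({
    value: entry.value,
    timestamp: entry.timestamp,
    signature: Buffer.from(entry.signature).toString("base64"),
    otsProof: Buffer.from(entry.otsProof).toString("base64"),
  });
  return Buffer.from(json, "utf-8");
}

// Reverse the encoding: parse the JSON text and decode the base64 fields.
function entryFromJsonBytes(bytes) {
  const obj = JSON.parse(Buffer.from(bytes).toString("utf-8"));
  return {
    value: obj.value,
    timestamp: obj.timestamp,
    signature: new Uint8Array(Buffer.from(obj.signature, "base64")),
    otsProof: new Uint8Array(Buffer.from(obj.otsProof, "base64")),
  };
}
```

The base64 step is the "custom JSON bits" cost: every reader and writer has to know which fields are binary. A format with a native byte-string type avoids that bookkeeping.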

Methods

@RangerMauve's HyperbeeDeeBee uses BSON so if we use it, we might be able to avoid going to a low-level solution, like doing our own conversions. Let's start with HyperbeeDeeBee and see whether we hit any roadblocks. See comment below.

For now, we cannot use HyperbeeDeeBee because of its AGPL license (Starling must use MIT). For this task, we would like to write a thin BSON wrapper around Hyperbee so that our values can be read as JSON, UTF-8, or binary as the need arises, while all ultimately being stored as binary, using Hyperbee's 'binary' value encoding instance-wide.
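
As a rough sketch of the wrapper idea, the following stores every value as bytes and lets the reader pick the decoding. `BinaryStore` is an in-memory stand-in for a Hyperbee opened with the 'binary' value encoding, and `TypedValues` is a hypothetical name. Real Hyperbee calls are asynchronous, and this sketch uses JSON text rather than BSON to stay dependency-free.

```javascript
// In-memory stand-in for a store that only accepts bytes as values.
// (Assumption: real Hyperbee put/get are async; this sketch is synchronous.)
class BinaryStore {
  constructor() {
    this.map = new Map();
  }
  put(key, bytes) {
    if (!(bytes instanceof Uint8Array)) throw new TypeError("values must be bytes");
    this.map.set(key, bytes);
  }
  get(key) {
    return this.map.has(key) ? this.map.get(key) : null;
  }
}

// Hypothetical thin wrapper: everything is written as bytes,
// and the caller chooses how to decode on read.
class TypedValues {
  constructor(store) {
    this.store = store;
  }
  put(key, value) {
    let bytes;
    if (value instanceof Uint8Array) bytes = value; // already binary
    else if (typeof value === "string") bytes = Buffer.from(value, "utf-8");
    else bytes = Buffer.from(JSON.stringify(value), "utf-8"); // structured data
    this.store.put(key, bytes);
  }
  get(key, encoding = "binary") {
    const bytes = this.store.get(key);
    if (bytes === null) return null;
    if (encoding === "binary") return bytes;
    const text = Buffer.from(bytes).toString("utf-8");
    if (encoding === "utf-8") return text;
    if (encoding === "json") return JSON.parse(text);
    throw new Error(`unknown encoding: ${encoding}`);
  }
}
```

For example, `new TypedValues(new BinaryStore())` accepts a string, an object, or a `Uint8Array` in `put`, and `get(key, "json")` returns the parsed object for values that were written as structured data.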

Rather than BSON, we ended up using CBOR, specifically DAG-CBOR (see the PR linked in the final comment below), which has two big benefits: its encoding is deterministic (unlike BSON), and it is already supported in IPLD.
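
Why determinism matters for signed values can be shown with a small sketch. This is not DAG-CBOR; it uses JSON with sorted keys as a stand-in for a deterministic encoding, and `canonicalize` and `canonicalJson` are hypothetical helper names. The length-first key ordering mirrors the rule DAG-CBOR applies to map keys, assuming plain ASCII, non-numeric keys.

```javascript
// Recursively rebuild a value with object keys in a fixed order:
// shorter keys first, then lexicographic (assumes ASCII, non-numeric keys).
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort(
      (a, b) => a.length - b.length || (a < b ? -1 : a > b ? 1 : 0)
    );
    const out = {};
    for (const k of keys) out[k] = canonicalize(value[k]);
    return out;
  }
  return value;
}

// Serialize so that logically equal objects always yield the same text.
function canonicalJson(value) {
  return JSON.stringify(canonicalize(value));
}

// Same content, different key insertion order.
const first = { timestamp: "2023-04-14", value: "x" };
const second = { value: "x", timestamp: "2023-04-14" };

// Plain JSON.stringify follows insertion order, so the bytes differ
// and a signature over one would not verify against the other.
const plainDiffers = JSON.stringify(first) !== JSON.stringify(second);

// The canonical form is identical for both.
const canonicalMatches = canonicalJson(first) === canonicalJson(second);
```

With a deterministic encoding, anyone who re-encodes the same attestation gets the same bytes, so hashes and signatures computed over the encoded value stay stable.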

Background

From the Hyperbee documentation:

| Property | Description | Type | Default |
| --- | --- | --- | --- |
| valueEncoding | Encoding type for the values. Takes values of 'json', 'utf-8', or 'binary'. | String | 'binary' |
| keyEncoding | Encoding type for the keys. Takes values of 'ascii', 'utf-8', or 'binary'. | String | 'binary' |

From the HyperCore documentation:

| Property | Description | Type | Default |
| --- | --- | --- | --- |
| valueEncoding | one of 'json', 'utf-8', or 'binary' | String | 'binary' |

RangerMauve commented 1 year ago

Yeah I would strongly advise using BSON instead of rolling new stuff or having to deal with custom JSON bits. In particular, using BSON means that you can sort stuff by date more easily with the built in indexing support.

RangerMauve commented 1 year ago

It might be worth it to store the "raw" OTS stuff in the document, but then add a Date property along side it so that you can index over it and perform range queries easily.
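
A minimal sketch of that idea, under stated assumptions: each document keeps the raw proof bytes plus a plain date, and a secondary index key is built from the date so that string order matches time order. `indexKey` and `rangeQuery` are hypothetical helpers, and the array filter stands in for a range read over sorted keys (in Hyperbee, roughly what `createReadStream` with `gte` and `lt` bounds would do).

```javascript
// Build a sortable index key. Date.prototype.toISOString() is fixed-width UTC
// for years 0-9999, so lexicographic order equals chronological order.
function indexKey(date, primaryKey) {
  return `${date.toISOString()}/${primaryKey}`;
}

// Return index keys with from <= date < to, in chronological order.
function rangeQuery(indexKeys, from, to) {
  const gte = from.toISOString();
  const lt = to.toISOString();
  return indexKeys.filter((k) => k >= gte && k < lt).sort();
}

// Hypothetical documents: raw proof bytes plus a convenience date.
const docs = [
  { id: "doc-a", otsProof: new Uint8Array([1]), date: new Date("2023-03-30T10:00:00Z") },
  { id: "doc-b", otsProof: new Uint8Array([2]), date: new Date("2023-04-14T12:00:00Z") },
  { id: "doc-c", otsProof: new Uint8Array([3]), date: new Date("2023-04-02T08:30:00Z") },
];

const keys = docs.map((d) => indexKey(d.date, d.id));
const april = rangeQuery(
  keys,
  new Date("2023-04-01T00:00:00Z"),
  new Date("2023-05-01T00:00:00Z")
);
```

The date in the index is only a convenience copy; the raw proof stays in the document so the date can still be checked independently.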

katelynsills commented 1 year ago

Thanks so much, @RangerMauve. I'll start with HyperbeeDeeBee then, and will let you know if we hit any roadblocks there!

> It might be worth it to store the "raw" OTS stuff in the document, but then add a Date property along side it so that you can index over it and perform range queries easily.

This sounds like what we want ideally from a timestamping service - the timestamp date as a signed attestation just like any other property/value. None of the main existing timestamping services give this because they think it undermines the untrusted nature, but I think this is extremely helpful in addition to a proof. Not everyone wants or needs to follow the entire proof, just like not everyone needs to run their own Bitcoin node. It'd be nice if they did, but we all have limited resources.

Does that cover what you meant here, or is there something else too?

RangerMauve commented 1 year ago

Yeah! If I understand correctly, I think it fits into the "trust but verify" mode of thought, where you can have indexes and "raw" values for working with data, but you have an easy path to verifying data within it outside of trusting the index.

makew0rld commented 1 year ago

Just wanted to flag that hyperbeedee's AGPL license will conflict with our current MIT license for the repo.

katelynsills commented 1 year ago

@RangerMauve, is there any chance that hyperbeedee's license could change to be compatible with MIT?

RangerMauve commented 1 year ago

Hmm, I'm not sure TBH. Is MIT a hard requirement? I'd really like to avoid corpos coming in and close sourcing work behind paywalls and the such. 😅

RangerMauve commented 1 year ago

Since you aren't modifying the hyperbeedeebee source it might be enough to just link to it somewhere, right?

benhylau commented 1 year ago

Not really. We wouldn't be able to use it at all with AGPL. Many orgs have a policy of basically not taking any code that is AGPL bc of the risk to their existing codebase.

katelynsills commented 1 year ago

This was updated to use dag-cbor on 2023-04-14 in this PR: https://github.com/starlinglab/uwazi-hyperbee-prototype/pull/14