Open OR13 opened 4 years ago
@csuwildcat @dmitrizagidulin @msporny @dlongley @tplooker
I'm not sure how familiar you all are with automerge, but the above structure works. I have tested it using both JWE/JWS and a few CWS/CWE experiments. With these proposed changes accepted, we could construct REST interfaces for Hubs, and a Hub Client, on top of the existing EDV infrastructure.
What if the user in this scenario wants to grant multi-recipient encrypted access to CRUD some subset within that Music vault to External Party A and another subset to External Party B, and those subsets of objects in that Music vault overlap. I am trying to understand how this structure deals with a sea of Venn unions of data encryption and access within a given target set of Music. Hopefully encryption and permissions are not set vault-wide? I would love to avoid data duplication and other ugly deoptimizations, if possible. (not saying those exist in this scheme, just trying to understand how you are thinking of handling this stuff)
@csuwildcat Here is a JWE
{
"protected": "eyJlbmMiOiJYQzIwUCJ9",
"recipients": [
{
"header": {
"kid": "did:key:z6Mkf8unjmyqsnDtZAjZkdNhw3LZWm5x9u3bbHCEdenD1Agq#z6LShX3PmBwYHGh8JL82zm3x8uT3bWEbLmfos66McREoEfvo",
"alg": "ECDH-ES+A256KW",
"epk": {
"kty": "OKP",
"crv": "X25519",
"x": "pK5QE4-dwpPdjejlB3VERU9XCy1t4xfa-JNUDVa9iVs"
},
"apu": "pK5QE4-dwpPdjejlB3VERU9XCy1t4xfa-JNUDVa9iVs",
"apv": "ZGlkOmtleTp6Nk1rZjh1bmpteXFzbkR0WkFqWmtkTmh3M0xaV201eDl1M2JiSENFZGVuRDFBZ3EjejZMU2hYM1BtQndZSEdoOEpMODJ6bTN4OHVUM2JXRWJMbWZvczY2TWNSRW9FZnZv"
},
"encrypted_key": "DwOEbW0OvtnQaqL4gc6_9Za1vzHrrLptI_UsPsGWFoBlUcASWP5qWQ"
}
],
"iv": "Et_yCe5BAWtSiAm2H3GEh192zNQiNA4d",
"ciphertext": "WC9zeH_Q90Z34VvX7Vsb2nK42qjZch2n-x2RweSjmVyVOxu__yAY870u5sRtaOjTPSNtxKoxHFNTbsVW2M5vlXTPStNtxdcGK8s2qPI_diR8E3E3pzqKr8iShZ2c3wuywILcgZWrdYlmzW9tcdBjLAnBdxbWdhqxwZNKLIu-11edpXA0KOra8qhK55mI8k_WUDTudV1w7aYVPFtngCwNy1hN4JsAGm1_NtB_WpXtua10oQ-PpP6d18i7c3jYCMZ56oaGCn5I1hf3yCO2OKgVJhxCsA2LzAu9gKxSm9ZPjhqrK5iRXUaE4lLWZNahgf_MRiNn5MDp7sN0GJ4IJFTs2On0_W6llwWgttkNiqtcsx48PiwlKgO2oimB0L7Y-bVpcinCpfDCK-UG6FGKaw7f1HsjWo4uthHdnCOm_Hw8dsSc7IPh0cORg4qbtAS4l_HDbPQroMlJIuLeOZqwMT55Ux32f3IfeVP5_1qitnigamOgHsfjuAV6ttKEgsEiDoAqa7kOQy_pB5jXkkJ57FURfKSG__hkbzm2L88djfaDAFFAz-7W0LvaEM4Dwew_-kAnoDJCBPa5MPG4W7MpXZhiafIuZsaD_Xk9OprHxFV_nXU8ztl0NKoc_H3Qg3l1D00wJQI_TWPqRfSqc5qHyRrh_TLRuTXpK2I2Hh3v-N_HrNotWN8p-McFnaV3cRtOMvLq44kF_X4_NPH8s7wYQ4yFkd2ffiFD",
"tag": "TSiAPqHT6t0wT1rWJppicQ"
}
If we are smart about JWE encoding, we can convert this to a content addressed URI, like so:
https://example.com/content/(CID of ciphertext)?jwe_meta=<encoded meta data>
where the encoded metadata contains the recipients:
[
{
"header": {
"kid": "did:key:z6Mkf8unjmyqsnDtZAjZkdNhw3LZWm5x9u3bbHCEdenD1Agq#z6LShX3PmBwYHGh8JL82zm3x8uT3bWEbLmfos66McREoEfvo",
"alg": "ECDH-ES+A256KW",
"epk": {
"kty": "OKP",
"crv": "X25519",
"x": "pK5QE4-dwpPdjejlB3VERU9XCy1t4xfa-JNUDVa9iVs"
},
"apu": "pK5QE4-dwpPdjejlB3VERU9XCy1t4xfa-JNUDVa9iVs",
"apv": "ZGlkOmtleTp6Nk1rZjh1bmpteXFzbkR0WkFqWmtkTmh3M0xaV201eDl1M2JiSENFZGVuRDFBZ3EjejZMU2hYM1BtQndZSEdoOEpMODJ6bTN4OHVUM2JXRWJMbWZvczY2TWNSRW9FZnZv"
},
"encrypted_key": "DwOEbW0OvtnQaqL4gc6_9Za1vzHrrLptI_UsPsGWFoBlUcASWP5qWQ"
}
]
Recipients can then be added or removed without the content ID of the ciphertext changing.
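A minimal sketch of that split, under stated assumptions: a plain SHA-256 hex digest stands in for a real CID (an actual IPFS CID would use multihash/multibase encoding), the `jwe_meta` value is base64url-encoded JSON, and the helper names `content_id` and `jwe_to_uri` are hypothetical. Only the recipients list is carried in the metadata here, as described above; the `protected`, `iv` and `tag` members would also have to reach the reader somehow, which this sketch leaves out. The key ids and encrypted keys are placeholders.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, the same encoding JWE members use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def content_id(ciphertext_b64url: str) -> str:
    # Stand-in for a real CID: SHA-256 hex digest of the ciphertext member only.
    return hashlib.sha256(ciphertext_b64url.encode("ascii")).hexdigest()


def jwe_to_uri(jwe: dict, base: str = "https://example.com/content") -> str:
    # The recipients list travels in the query string, outside the hashed content.
    meta_json = json.dumps(jwe["recipients"], sort_keys=True, separators=(",", ":"))
    meta = b64url(meta_json.encode("utf-8"))
    cid = content_id(jwe["ciphertext"])
    return f"{base}/{cid}?jwe_meta={meta}"


# Placeholder JWE: shortened ciphertext, illustrative recipients.
jwe = {
    "protected": "eyJlbmMiOiJYQzIwUCJ9",
    "recipients": [
        {"header": {"kid": "did:example:alice#key-1"}, "encrypted_key": "AAAA"}
    ],
    "iv": "Et_yCe5BAWtSiAm2H3GEh192zNQiNA4d",
    "ciphertext": "WC9zeH_Q90Z34VvX7Vsb2nK42qjZch2n",
    "tag": "TSiAPqHT6t0wT1rWJppicQ",
}

before = jwe_to_uri(jwe)
jwe["recipients"].append(
    {"header": {"kid": "did:example:bob#key-1"}, "encrypted_key": "BBBB"}
)
after = jwe_to_uri(jwe)

# Adding a recipient changes only the query string, not the content path.
print(before.split("?")[0] == after.split("?")[0])
```

This works in JWE general serialization because per-recipient headers sit outside the protected header, so they are not part of the authenticated data bound to the ciphertext and tag.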
Now to the question of "who gets to access cipher text".... if your CID system is IPFS and you are on the public internet.... everyone!
If you are on private IPFS, and you modulate your peer set to logically correspond to JWE recipients.... that's how EDVs work today....
So if the Documents are JWEs with special headers.... and they contain AutoMerge Deltas... and the peer set / authorization set is controlled by the storage provider, and the storage provider is honest (modulates peers according to the preferences described in the JWE headers).... then I believe that's everything you are asking for....
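As a rough illustration of what an "honest" storage provider would do with those headers, here is a sketch that serves a document only to requesters whose key id appears in the JWE recipients. The function names `recipient_kids` and `may_fetch` are hypothetical, and matching on `kid` is an assumption; a real provider would also need to authenticate the requester, which is out of scope here.

```python
def recipient_kids(jwe):
    """Collect the key ids named in the recipients of a general-serialization JWE."""
    return {r["header"]["kid"] for r in jwe.get("recipients", [])}


def may_fetch(jwe, requester_kid):
    # Hypothetical provider-side policy: release ciphertext only to listed recipients.
    return requester_kid in recipient_kids(jwe)


# Placeholder document with two recipients.
doc = {
    "recipients": [
        {"header": {"kid": "did:example:alice#key-1"}, "encrypted_key": "AAAA"},
        {"header": {"kid": "did:example:bob#key-1"}, "encrypted_key": "BBBB"},
    ],
    "ciphertext": "placeholder",
}

print(may_fetch(doc, "did:example:alice#key-1"))    # listed recipient
print(may_fetch(doc, "did:example:mallory#key-1"))  # not listed
```

Note that this check only limits who receives ciphertext; confidentiality still rests on the wrapped keys, since a dishonest provider could ignore the policy.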
Can you refine your question further now?
Important side note on vector clocks: http://pl.atyp.us/wordpress/index.php/2010/03/conflict-resolution/ ... and note that automated conflict resolution in open systems is an even more challenging problem than closed ones. Our work here, of course, adds the additional complexity that we want to minimize the information the server knows about the data.
We need to analyze the privacy difference between exposing a simple sequence number to address inconsistencies that can arise just due to the partition between the client and the server vs. exposing "automerge deltas" in some way that is intended to address more complex synchronization concerns across servers.
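For comparison, the "simple sequence number" side of that trade-off can be sketched as optimistic concurrency on an opaque document. This is an illustrative in-memory model, not the EDV API: `DocumentStore`, `SequenceConflict` and the rule that a new document starts at sequence 0 are assumptions. The point is that the server learns only a counter, never the structure of the change.

```python
class SequenceConflict(Exception):
    """Raised when a write does not carry the next expected sequence number."""


class DocumentStore:
    def __init__(self):
        # doc_id -> (sequence, opaque encrypted payload)
        self._docs = {}

    def put(self, doc_id, sequence, encrypted_payload):
        current = self._docs.get(doc_id)
        # Assumption: first write uses sequence 0, each update increments by one.
        expected = 0 if current is None else current[0] + 1
        if sequence != expected:
            raise SequenceConflict(f"expected {expected}, got {sequence}")
        self._docs[doc_id] = (sequence, encrypted_payload)

    def get(self, doc_id):
        return self._docs[doc_id]


store = DocumentStore()
store.put("doc-1", 0, "jwe-v0")
store.put("doc-1", 1, "jwe-v1")

# A second client that still holds sequence 0 tries to write sequence 1 again.
try:
    store.put("doc-1", 1, "jwe-v1-from-stale-client")
    conflict = False
except SequenceConflict:
    conflict = True

print(conflict)
```

A stale client has to re-fetch, merge locally, and retry; any merge logic, AutoMerge or otherwise, then stays on the client side of the encryption boundary.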
Agree, we should discuss how sequence numbers are used, and their relationship to indexes... my assumption right now is that Hub resources are built on top of EDV documents, and that the default strategy of "no additional data is needed" is accurate. I am waiting for a counter proposal from @csuwildcat .
What if the user in this scenario wants to grant multi-recipient encrypted access to CRUD some subset within that Music vault to External Party A and another subset to External Party B, and those subsets of objects in that Music vault overlap. I am trying to understand how this structure deals with a sea of Venn unions of data encryption and access within a given target set of Music. Hopefully encryption and permissions are not set vault-wide? I would love to avoid data duplication and other ugly deoptimizations, if possible. (not saying those exist in this scheme, just trying to understand how you are thinking of handling this stuff)
@csuwildcat This is a really important point and a question I have with the current architecture.
When designing the Verida Datastore, I redesigned the whole system after a few false starts to ensure subsets of encrypted data could be appropriately permissioned across multiple applications and then synchronized in both directions.
It might help for us to understand what we're hoping for as compared to what Dropbox does now. Here's a screen shot.
Note "Connect apps..." in the lower right, where I would authorize access by other apps that I, as the owner, might or might not control.
There are other features illustrated.
@agropper they're expressing the same concern that I have: I don't get a sense that the architecture of the current EDV stuff is really attuned to saying: "I have encrypted 1000 objects spanning hundreds of different types located on a remote instance, and I want to give 100 different entities access to different subsets of those objects, without duplicating the objects, creating folders, or any other form of segmentation. I want to encrypt the data such that I can create a 'sea of access Venn diagram overlaps' over the 1000 object set by simply issuing them a permission secret that contains the access capability + a decryption key that is only usable for decrypting the subset of the objects they are allowed to access" <-- this is what you need for a real-deal multiparty decentralized app datastore, and we need to make sure that's possible. If the foundations don't make that easy, we need to change the foundations, not change our requirements to fit whatever the foundations can't do as of today.
@csuwildcat What you want is very reasonable as a use-case, I just don't know how to think of it in technical terms.
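One possible way to put the requirement in technical terms, following the per-recipient JWE structure shown earlier in the thread: each object is stored once, and access is expressed per object as a set of recipients, so overlapping subsets need no duplication or folders. The sketch below is bookkeeping only. The `Vault` class and its methods are hypothetical, and the actual key wrapping for each recipient is elided. It also does not implement the single "permission secret" per party described above; it only shows that per-object grants can overlap freely.

```python
class Vault:
    """Hypothetical owner-side index: one stored object, many recipients."""

    def __init__(self):
        # object_id -> set of parties holding a wrapped key for that object
        self.recipients = {}

    def add(self, object_id):
        self.recipients.setdefault(object_id, set())

    def grant(self, party, object_ids):
        # In a real system this step would add a wrapped content key per object.
        for object_id in object_ids:
            self.recipients[object_id].add(party)

    def readable_by(self, party):
        return {oid for oid, parties in self.recipients.items() if party in parties}


music = Vault()
for song in ("song-1", "song-2", "song-3"):
    music.add(song)

music.grant("party-a", ["song-1", "song-2"])
music.grant("party-b", ["song-2", "song-3"])

overlap = music.readable_by("party-a") & music.readable_by("party-b")
print(sorted(overlap))        # objects both parties can read
print(len(music.recipients))  # objects actually stored
```

The cost of this reading is that recipient lists grow with the number of grantees per object, which is part of what the question above is probing.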
/sync should be exposed, and it should leverage the JWE headers to return a result set of EDV Documents.
CRDT_TYPE: AutoMerge contain AutoMerge deltas.
Developer user story
/sync on tablet....