decentralized-identity / decentralized-web-node

Decentralized data storage and message relay for decentralized identity and apps.
https://identity.foundation/decentralized-web-node/spec/
Apache License 2.0

Support both folder based and schema based object storage #190

Closed ianconsolata closed 1 month ago

ianconsolata commented 2 years ago

Here is a brief writeup that tries to summarize the discussion we had on the call today. It includes an implementation direction suggested by @dmitrizagidulin, based on the approach outlined in https://www.preveil.com/wp-content/uploads/2022/02/PreVeil_Security_Whitepaper-v1.4-Feb-22.pdf.

Right now, DWNs create a unique encryption key for each schema string / data type, and encrypt all files with that schema using that key. That key is then shared with users and applications when the user grants them access to files in a particular schema-based collection. This allows individuals to easily share all data of a particular type, but requires files to be copied and re-encrypted to include them in a second schema-based collection.

The proposed implementation differs in that instead of encrypting data with a per-schema key, it encrypts each file with a unique symmetric key, and then just encrypts copies of that key instead of copies of the encrypted file.

Adding new files to a schema collection (e.g. MusicPlaylist):
Current: Encrypts file with schema key, uploads encrypted file.
Proposed: Encrypts file with a new symmetric file key, encrypts file key with schema key, uploads encrypted file key and encrypted file.

Retrieving a file from a schema collection:
Current: Retrieves encrypted file, decrypts with schema-specific key.
Proposed: Retrieves encrypted file and encrypted file key. Decrypts file key with schema-specific key, decrypts file with decrypted file key.

Sharing all files in a schema collection with a new user:
Current: Shares schema key with new user.
Proposed: Shares schema key with new user.

Revoking access to all files in a schema collection from an existing user:
Current: Re-encrypts all data with new schema key, shares new schema key with the new list of approved users.
Proposed: DOES NOT ROTATE FILE KEYS. Re-encrypts all file keys with new schema key. Shares new schema key with the new list of approved users. Revoked user still has the file encryption key, but has no way to get updated copies of the encrypted data because of AuthZ.
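
For anyone who wants to see the proposed flow end to end, here is a minimal sketch of the add / retrieve / revoke steps. Everything in it is an assumption made for illustration: the function names (`add_file`, `read_file`, `rotate_schema_key`), the record layout, and especially the cipher. `seal` and `open_sealed` are a toy stdlib-only construction (SHA-256 keystream plus HMAC tag, same key for both) standing in for a real authenticated cipher such as AES-GCM. It is not secure and is not what any DWN implementation uses.

```python
import hashlib
import hmac
import secrets

KEY_LEN = 32  # bytes; assumed size for both schema keys and file keys


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy encrypt-then-MAC. Layout: nonce (16) | ciphertext | tag (32)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt. Raises ValueError on a wrong key."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


def add_file(schema_key: bytes, plaintext: bytes) -> dict:
    """Proposed 'add': fresh symmetric file key, wrapped with the schema key."""
    file_key = secrets.token_bytes(KEY_LEN)
    return {
        "encrypted_file": seal(file_key, plaintext),
        "wrapped_key": seal(schema_key, file_key),
    }


def read_file(schema_key: bytes, record: dict) -> bytes:
    """Proposed 'retrieve': unwrap the file key, then decrypt the file."""
    file_key = open_sealed(schema_key, record["wrapped_key"])
    return open_sealed(file_key, record["encrypted_file"])


def rotate_schema_key(old_key: bytes, records: list) -> bytes:
    """Proposed 'revoke': re-wrap file keys only; encrypted files are untouched."""
    new_key = secrets.token_bytes(KEY_LEN)
    for record in records:
        file_key = open_sealed(old_key, record["wrapped_key"])
        record["wrapped_key"] = seal(new_key, file_key)
    return new_key
```

After `rotate_schema_key`, the old schema key no longer unwraps the stored file keys, while each `encrypted_file` blob is byte-for-byte the same as before. That matches the trade-off described above: a revoked user who kept a file key could still decrypt a ciphertext they already hold, so AuthZ has to stop them from fetching new ones.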

The benefit of this solution is that multiple sets of encrypted keys can easily co-exist without extreme storage requirements, allowing the user to create and share arbitrary groupings of data, whether organized by schema, by folder hierarchy, by tags, or by some other mechanism. Each new encryption index would require the user to copy and re-encrypt the file keys according to the new sharing scheme, but the file itself would only need to be stored once.
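
To put rough numbers on the storage argument, here is a back-of-the-envelope sketch. The function name `storage_bytes` and the 80-byte size of a wrapped file key are assumptions for illustration (a 32-byte key plus some nonce and tag overhead), and per-file ciphertext overhead is ignored.

```python
def storage_bytes(file_sizes, groupings, wrapped_key_bytes=80):
    """Compare total storage for the two approaches.

    file_sizes: sizes in bytes of the files being shared.
    groupings: number of collections (schema, folder, tag, ...) each file is in.
    wrapped_key_bytes: assumed size of one encrypted file key (hypothetical).
    Returns (copy_files_total, wrap_keys_total).
    """
    total = sum(file_sizes)
    # Current approach: each grouping stores its own re-encrypted copy.
    copy_files_total = total * groupings
    # Proposed approach: one encrypted file, plus one wrapped key per grouping.
    wrap_keys_total = total + len(file_sizes) * groupings * wrapped_key_bytes
    return copy_files_total, wrap_keys_total


# 1000 files of 5 MB each, shared through 3 different groupings.
copies, wrapped = storage_bytes([5_000_000] * 1000, groupings=3)
print(copies)   # 15000000000
print(wrapped)  # 5000240000
```

Under these assumptions, each extra grouping costs 80 KB of wrapped keys instead of another 5 GB of re-encrypted files.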

alanhkarp commented 2 years ago

Revoking by changing keys and distributing new ones seems like a lot of work and complicates chained delegation. Can’t you just revoke the appropriate capabilities? That being said, the re-encrypting strategy was used in an electronic medical records project I participated in, so it does work. On the other hand, that project didn’t use capabilities for access control, so they had no other option.

Marc Stiegler came up with a different solution for a peer-to-peer file sharing project we worked on a while back. It may not be appropriate here, but I’ll outline it anyway. You set up a synchronizer when a file is shared and give the delegatee a capability to the synchronizer. When the file is first shared or updated, the synchronizer sends the contents. If the file is shared read/write, the synchronizer accepts updates. You revoke write access by telling the synchronizer to stop accepting updates. Revocation of all permissions is done by deleting the synchronizer. In our implementation, chained delegation involved a chain of synchronizers, but I don’t think that’s necessary here.
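
As a rough illustration of the synchronizer idea, here is a minimal single-process sketch. The class and method names are hypothetical and not taken from the project mentioned above. Holding a reference to the `Synchronizer` object stands in for holding the capability, and the `deliver` callback stands in for whatever transport sends contents to the delegatee.

```python
from typing import Callable


class Synchronizer:
    """Hypothetical per-share object sitting between one file and one delegatee."""

    def __init__(self, contents: bytes, deliver: Callable[[bytes], None],
                 writable: bool = False):
        self.contents = contents
        self._deliver = deliver
        self._writable = writable
        self._active = True
        # When the file is first shared, the synchronizer sends the contents.
        self._deliver(contents)

    def owner_update(self, contents: bytes) -> None:
        """Owner changed the file; forward it while the synchronizer exists."""
        self.contents = contents
        if self._active:
            self._deliver(contents)

    def delegatee_update(self, contents: bytes) -> bool:
        """Accept a delegatee's update only on an active read/write share."""
        if not (self._active and self._writable):
            return False
        self.contents = contents
        return True

    def revoke_write(self) -> None:
        """Revoke write access: stop accepting updates from the delegatee."""
        self._writable = False

    def delete(self) -> None:
        """Revoke all permissions: the synchronizer stops doing anything."""
        self._active = False


received = []
sync = Synchronizer(b"v1", received.append, writable=True)
sync.delegatee_update(b"v2")   # accepted while the share is read/write
sync.revoke_write()
sync.delegatee_update(b"v3")   # rejected after write revocation
sync.delete()
sync.owner_update(b"v4")       # no longer delivered to the delegatee
```

Note that no keys are rotated anywhere in this model; revocation is purely a matter of the synchronizer refusing to act.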

ianconsolata commented 2 years ago

@csuwildcat any thoughts on this approach?

andorsk commented 1 year ago

@csuwildcat following up here. We need the spec to be updated so that we are clear about where things stand w.r.t. the permission/records layer.

andorsk commented 1 month ago

Discussed on DIF: Most of these paths are supported. Motion to close this issue.