oceanprotocol / pm

Zenhub needs each issue associated with one repo. This repo is a workaround, to mark issues that span >1 repos.

Data encryption #193

Closed alexcos20 closed 6 months ago

alexcos20 commented 1 year ago

Some use cases require data to be encrypted, especially when stored on public decentralized infrastructure like IPFS or Arweave (because anyone can read it).

Let's fix that:

trentmc commented 1 year ago

Right now people can share encrypted data. It's just that encryption & decryption have to be done client-side. And that's ok for many use cases! We shouldn't ignore this.
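For illustration, here is a minimal sketch of what "client-side" symmetric encryption means: the publisher encrypts before uploading, the consumer decrypts after downloading, and the storage layer only ever sees ciphertext. This is not Ocean or ocean.py code. The function names are hypothetical, and the cipher is a toy encrypt-then-MAC stream construction built from HMAC-SHA256 so that the example runs with only the Python standard library. Real code should use a vetted authenticated cipher from an audited library instead.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32  # length of an HMAC-SHA256 digest


def _subkeys(key: bytes):
    """Derive separate encryption and MAC keys from one shared key."""
    enc_key = hmac.new(key, b"enc", hashlib.sha256).digest()
    mac_key = hmac.new(key, b"mac", hashlib.sha256).digest()
    return enc_key, mac_key


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce || counter, block by block."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(enc_key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Publisher side: returns nonce || ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Consumer side: verify the tag first, then decrypt."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("blob too short")
    enc_key, mac_key = _subkeys(key)
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# Usage: the shared key must reach the consumer out of band.
shared_key = secrets.token_bytes(32)
blob = encrypt(shared_key, b"rows of private data")
assert decrypt(shared_key, blob) == b"rows of private data"
```

The sketch leaves open how `shared_key` reaches the consumer, which is the key-sharing question the rest of this thread is about.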

I worry about Provider doing yet more work. It means that people have to trust the Provider runner that much more; and it means more complexity in the Ocean stack. I'd prefer to avoid both, if possible.

There are many possible flows on encrypted data. For a specific flow, one needs to work out:

  1. is encryption / decryption symmetric or asymmetric?
    • if symmetric, how is the encryption key shared?
    • if asymmetric, how does the encrypting party get the public key to encrypt with?
  2. where is the data? It could be (a) on-chain via metadata / Aquarius, (b) on-chain via ERC725, (c) on-chain via urls (eg Arweave), or (d) off-chain via urls (eg Filecoin, web2 storage, ..).
  3. is the encryption in GUI, JS, or Py? Is the decryption in GUI, JS, or Py?

The ocean.py README "Private Sharing of On-Chain Data" has examples for (1) both symmetric & asymmetric encryption, (2) when data is on-chain via ERC725, and (3) both encryption & decryption in Py.

So if any new work is done for this issue, I recommend that we identify which specific flow we'd like to support better. That is: answer the questions (1)(2)(3) above.

alexcos20 commented 1 year ago

@trentmc - your flows only work when you know who is buying the asset. But this is a very, very narrow case. For a regular publisher, this does not apply, because they do not know in advance who will buy the asset.

As publisher, I have the following concerns:

Possible ways to encrypt when buyer is not known:

alexcos20 commented 1 year ago

The ideal case would be to have this sharing key mechanism in datatoken contract. IE:

Unfortunately, I don't think this is possible in Solidity for now. Maybe a combination of on-chain/off-chain components can do the trick, but this is still in a design phase.

trentmc commented 1 year ago

> ideal case would be to have this key sharing mechanism in datatoken contract

Actually you can do it with threshold cryptography. Eg Lit Protocol. The key gets sharded up across nodes in Lit Network, then reconstructed for the user when needed.

I bet you could prototype this in a matter of hours
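To make the "shard the key, then reconstruct it" idea concrete, here is a minimal sketch using Shamir secret sharing over a prime field. This is an assumption made for illustration: it is not Lit Protocol's API, and Lit's actual threshold scheme may differ. The function names `split_secret` and `recover_secret` are hypothetical. The point is only that a decryption key can be split so that any `threshold` of the shares rebuild it, while fewer shares are not enough.

```python
import secrets
from typing import List, Tuple

# The Mersenne prime 2**521 - 1; large enough to hold a 256-bit key.
PRIME = 2**521 - 1

Share = Tuple[int, int]  # a point (x, y) on the secret polynomial


def split_secret(secret: int, threshold: int, num_shares: int) -> List[Share]:
    """Split `secret` into `num_shares` shares; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Share]) -> int:
    """Rebuild the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Usage: a 256-bit data key sharded across 5 nodes, any 3 of which suffice.
data_key = secrets.randbits(256)
node_shares = split_secret(data_key, threshold=3, num_shares=5)
assert recover_secret(node_shares[:3]) == data_key
assert recover_secret(node_shares[2:]) == data_key
```

In a datatoken setting, each node would presumably release its share only after checking on-chain that the requester holds the datatoken; that access check is outside this sketch.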

> your flows are only working when you know who is buying the asset. But this is a very very narrow use case

Actually the flows work across a super broad spectrum. Example uses:

While broad, the above cases do not support the "data marketplace" use case, where the seller doesn't know the buyer in advance (as you point out). It would be useful to support that flow, and the broader version of that flow, which is all around datatokens.

Ideally we support that via Lit. (Perhaps this is the perfect time to introduce a third datatoken template that does exactly this. Not only for encrypt / decrypt of data, but of the URL itself. Two birds, one stone. I believe it's less work than one might think.)

The other option to support this flow would be like you suggested in the description - to entrust Provider with the keys to encrypt / decrypt.

My recommendation is to try the Lit approach first. It's more trustless, and it's where we want to be eventually anyway.