decentralized-identity / decentralized-web-node

Decentralized data storage and message relay for decentralized identity and apps.
https://identity.foundation/decentralized-web-node/spec/
Apache License 2.0

IPFS in Protocol Stack #179

Open WilliamTheHippo opened 2 years ago

WilliamTheHippo commented 2 years ago

The current list of component layers in §5 covers all of the components needed to implement DWN functionality, and the lowest level reads simply "IPFS". While I have no doubt that IPFS is the most appropriate way to host a DWN, a DWN can also be hosted on centralized servers, torrent files, etc.

Maybe change "IPFS" to "Data Storage" and add a supplemental note below that IPFS is the recommended system for data storage? Or maybe even a section/appendix of recommended systems to implement each component in the stack, which would be more extensible. (We could recommend encryption standards, schemas, things for other components of the stack as well.)

csuwildcat commented 2 years ago

It should read IPLD Blockstore, as that's the only actual reliance within the spec.

wyc commented 2 years ago

+1 IPLD

mweel1 commented 2 years ago

I agree the storage layer should be an abstraction (I think). It seems that requiring a decentralized approach for the data store becomes pointless when it sits behind this service, or am I missing some other extensibility of the data store outside of the context of this service?

csuwildcat commented 2 years ago

I agree the storage layer should be an abstraction (I think). It seems that requiring a decentralized approach for the data store becomes pointless when it sits behind this service, or am I missing some other extensibility of the data store outside of the context of this service?

I don't fully understand what you mean by 'this service'. Each instance of your DWeb Node personal datastore is a masterless clone, and the fact that you may choose to have an instance remote from your devices, in addition to those on your devices, doesn't make the system centralized or a service in any typical sense.

mweel1 commented 2 years ago

I personally would not like to see the "back-end" implementation of this standard in the specification. People should be able to choose whether the implementation is replicated, centralized, or whatever. The specification should just define the contract the service accepts and responds to, IMHO.

If webhooks (which I think are required) are added, there will be a host of design decisions to make around local storage, queuing, which node handles the messaging (it's masterless?), etc. Do you really want to get into all of that, or just define the contract for accepting a subscription and what to expect on a publication?

I guess my point is, I don't want lock-in as it relates to the backend. Is that really required for this to be a success? Is this a standard or a product? If it's both, I would suggest separating them.

agropper commented 2 years ago

The data store should absolutely be abstracted out of the search and control components so that they may be separately assorted. Otherwise we have an unnecessary lock-in situation as well as a violation of data minimization fair information practice principles.

In cases where a data store (or other processor) does not recognize the authority of the resource owner as controller, the resource owner is forced to copy the data to a store that does recognize their choice of controller. This is NOT privacy by default, since it forces the resource owner to choose between sharing their authorization policies with the data store operator and copying the data to somewhere that does respect the resource owner's choice of controller. Either way, there is a violation of data minimization.

The essence of the Patient Privacy Rights position was discussed in a panel at Identiverse: https://identiverse.com/idv2022/session/841489/ Here's the key slide describing separation and separate assortment between the authentication, authorization (policy), and persistence layers of a decentralized architecture.

https://docs.google.com/document/d/1gH1HVvOpJqLkg8BBbDCWh9SclDnJnztvd7x_YVVhtsw/edit

mweel1 commented 2 years ago

I am just getting my feet wet here, so let me start from the beginning.

A user would set up one of these decentralized web nodes (DWNs), which provides functionality for mutating one or many data sources by way of exposed services. That is the job of a decentralized web node (DWN), correct?

Are you saying the security context between the DWN and the data would not be shared? I was not going that far with it.

agropper commented 2 years ago

Depends on what you mean by "security context".

GDPR, Zero-Trust, and most other current security practices treat the distinction between data controller and data processor as fundamental. Adding "decentralized" to an otherwise invented name like Fred or Web Node does not alter the reality that making an actor play both the controller and processor roles is not recommended practice from either a privacy (GDPR) or security (ZTA) perspective.

csuwildcat commented 2 years ago

I'll say again that the spec only mandates the IPLD conventions around the way data is chunked and CID'd, not the way you actually have to store the bytes. If you don't at least normatively define the way data is assembled (e.g. canonicalized and identified with CIDs), you wouldn't be able to have different instances be interoperable with each other.
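
To illustrate the point above, here is a minimal sketch (not taken from the spec or the reference implementation) of deriving a CIDv1 for a single raw block using only the Python standard library. The function name `cid_v1_raw` and the two toy "backends" are hypothetical. It assumes the `raw` codec and a sha2-256 multihash, and it does not cover chunking or the IPLD codecs the spec may actually require. The identifier depends only on the bytes, so two nodes with different storage arrive at the same one.

```python
import base64
import hashlib

RAW_CODEC = 0x55  # multicodec code for "raw" bytes
SHA2_256 = 0x12   # multihash code for sha2-256


def cid_v1_raw(data: bytes) -> str:
    """Return a CIDv1 (raw codec, sha2-256) as a base32 multibase string.

    Hypothetical helper for illustration. Every field used here is below
    0x80, so each varint is a single byte.
    """
    digest = hashlib.sha256(data).digest()
    multihash = bytes([SHA2_256, len(digest)]) + digest
    cid_bytes = bytes([0x01, RAW_CODEC]) + multihash  # version 1, then codec
    # Multibase prefix "b" means lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


# Two unrelated toy "backends": a dict and a list of rows.
dict_store = {}
row_store = []

payload = b"hello world"
dict_store[cid_v1_raw(payload)] = payload
row_store.append((cid_v1_raw(payload), payload))

# Both backends hold the payload under the same identifier.
same_cid = dict_store.keys() == {cid for cid, _ in row_store}
```

How the bytes are persisted (a dict, a table, a blockstore) never enters the computation, which is why the CID conventions can be normative while the storage engine stays an implementation choice.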

agropper commented 2 years ago

Data models like chunking and CIDs, as well as DIDs and VCs, are essential for interop but separate from protocols and the roles they enable. The roles of various actors are enabled by the protocols, and that's what impacts privacy and decentralization.

mweel1 commented 2 years ago

GDPR, Zero-Trust, and most other current security practices treat the distinction between data controller and data processor as fundamental. Adding "decentralized" to an otherwise invented name like Fred or Web Node does not alter the reality that making an actor play both the controller and processor roles is not recommended practice from either a privacy (GDPR) or security (ZTA) perspective.

Why?

mistermoe commented 2 years ago

@mweel1 the storage layer is an abstraction. The JS reference implementation provides a MessageStore interface that defines all of the method signatures needed to perform storage, retrieval, and deletion of messages. The motivation behind providing this interface is to let developers use whichever underlying storage technology best fits their needs/use case, e.g. MySQL, MongoDB, LevelDB, CockroachDB, etc.
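
As a rough sketch of that idea (in Python rather than JS, and not the reference implementation's actual interface), a storage abstraction of this kind might look like the following. The method names `put`, `get`, and `delete`, the two backend classes, and the `exercise` helper are all assumptions made for illustration.

```python
import sqlite3
from typing import Dict, Optional, Protocol, Tuple


class MessageStore(Protocol):
    """Hypothetical storage contract: store, retrieve, delete by CID."""

    def put(self, cid: str, message: bytes) -> None: ...
    def get(self, cid: str) -> Optional[bytes]: ...
    def delete(self, cid: str) -> None: ...


class InMemoryMessageStore:
    """Keeps messages in a plain dict."""

    def __init__(self) -> None:
        self._messages: Dict[str, bytes] = {}

    def put(self, cid: str, message: bytes) -> None:
        self._messages[cid] = message

    def get(self, cid: str) -> Optional[bytes]:
        return self._messages.get(cid)

    def delete(self, cid: str) -> None:
        self._messages.pop(cid, None)


class SqliteMessageStore:
    """Keeps messages in a SQLite table (in-memory database here)."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE messages (cid TEXT PRIMARY KEY, body BLOB)"
        )

    def put(self, cid: str, message: bytes) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO messages (cid, body) VALUES (?, ?)",
            (cid, message),
        )

    def get(self, cid: str) -> Optional[bytes]:
        row = self._db.execute(
            "SELECT body FROM messages WHERE cid = ?", (cid,)
        ).fetchone()
        return row[0] if row else None

    def delete(self, cid: str) -> None:
        self._db.execute("DELETE FROM messages WHERE cid = ?", (cid,))


def exercise(store: MessageStore) -> Tuple[Optional[bytes], Optional[bytes]]:
    """Store a message, read it back, delete it, and read again."""
    store.put("example-cid", b"payload")
    before = store.get("example-cid")
    store.delete("example-cid")
    after = store.get("example-cid")
    return before, after
```

Code that depends only on the `MessageStore` contract, like `exercise` here, runs unchanged against either backend, which is the kind of choice the comment above describes.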

mweel1 commented 2 years ago

@mistermoe great! It sounds like there is a gap between the spec and the code then. I believe this is the right approach.

csuwildcat commented 2 years ago

Resolved in latest commit via restating that the base dependency is only IPLD multiformats/codecs: https://identity.foundation/decentralized-web-node/spec/#protocol-stack

csuwildcat commented 2 years ago

Tagging pending close unless there are other concerns raised about the language change.

oed commented 2 years ago

"IPLD multiformats" is not very precise. If you are using the IPLD data model and blockstore approach, then multiformats is implicit. If you are using only multiformats for hashing data and creating CIDs (e.g. using the raw codec or similar), then you are not really using IPLD.

csuwildcat commented 2 years ago

What the spec uses:

If you know a couple of better overarching words to capture that for the diagram, then I think folks would be fine with changing it.

oed commented 2 years ago

@csuwildcat I would just say IPLD in this case! Also your attestation format is compatible with DagJOSE (which now is fully supported in go-ipfs) so using that will make your life easier when traversing DAGs.