ethereum-oasis-op / baseline-standard

Repository for the Baseline standards team and specification work
Creative Commons Zero v1.0 Universal

CORE Spec - Schema Management Component - #26

Closed - Consianimis closed this issue 3 years ago

Consianimis commented 3 years ago

Brainstorm:

Should the standard address requirements that ensure counterparties send and receive data as per the initially 'agreed' schema, or should this be out of scope?

Is there a need for a Baseline Protocol schema management framework?

Would those be the requirements of a Baseline-connector? A Baseline-connector is an interface connecting and synchronizing a baseline stack and a system of record.
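As a thought experiment, such a connector could be sketched as an interface like the one below. All type and method names are hypothetical - this illustrates the idea only and is not anything defined by the Baseline Protocol. (A real connector would be asynchronous; synchronous methods are used here for brevity.)

```typescript
// Hypothetical sketch of a Baseline-connector: an adapter that keeps a
// system of record (SOR) and a baseline stack in sync. All names are
// illustrative assumptions, not part of any Baseline specification.

/** A record as it exists in a participant's system of record. */
interface SorRecord {
  id: string;
  schemaId: string; // identifies the agreed schema this record conforms to
  payload: Record<string, unknown>;
}

/** Minimal connector contract between an SOR and the baseline stack. */
interface BaselineConnector {
  fetchRecord(id: string): SorRecord;   // pull a record out for baselining
  applyUpdate(record: SorRecord): void; // push a verified update back in
  validate(record: SorRecord): boolean; // check against the agreed schema
}

/** Toy in-memory implementation, for illustration only. */
class InMemoryConnector implements BaselineConnector {
  private store = new Map<string, SorRecord>();
  constructor(private agreedSchemaId: string) {}

  fetchRecord(id: string): SorRecord {
    const rec = this.store.get(id);
    if (!rec) throw new Error(`record ${id} not found`);
    return rec;
  }

  applyUpdate(record: SorRecord): void {
    // Reject anything that does not match the schema agreed at inception.
    if (!this.validate(record)) throw new Error("schema mismatch");
    this.store.set(record.id, record);
  }

  validate(record: SorRecord): boolean {
    return record.schemaId === this.agreedSchemaId;
  }
}
```

The point of the sketch is that schema enforcement can live at the connector boundary: whatever the counterparties agreed at workflow inception is the only shape the connector will accept or emit.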

jack-garvin commented 3 years ago

Related note: a setup process is necessary for the zero-knowledge proofs to be generated for each event in a workflow. Orchestrating this setup is a one-time process. There is only one shared, agreed data point at the inception of a workflow. All other counterparty data will be subject to GDPR constraints and inter-party contractual requirements.

The DID phonebook will contain some shared data. Additional data may be required for executing smart contracts.

Should this 'agreed' data be stored or not in a schema? Maybe.

Will the agreed data be altered in the future? No.

Consianimis commented 3 years ago

@jack-garvin - if two parties have different data models for 'invoices' in their respective systems, do you think it should be in scope of the protocol specification to state the requirements for: 1) agreement on a 'common' data model as a prerequisite to the baselining process, and 2) mechanisms to manage the various agreed 'data models' from within the baseline stack?

Or would it be preferable to state that this is part of the integration and should be left for the parties to handle?
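To make the question concrete: one naive way two parties could derive a 'common' data model before baselining is to intersect the fields each party's system exposes, keeping only fields where both name and type agree. Everything below (field names, the intersection rule) is a hypothetical illustration, not a proposed mechanism.

```typescript
// Illustrative sketch only: deriving a shared 'invoice' model by
// intersecting two parties' data models. All names are hypothetical.

type FieldType = "string" | "number" | "date";

/** A flat description of one party's invoice data model. */
type DataModel = Record<string, FieldType>;

/**
 * Keep only the fields both parties declare, and only where the
 * declared types agree.
 */
function agreeCommonModel(a: DataModel, b: DataModel): DataModel {
  const common: DataModel = {};
  for (const [field, type] of Object.entries(a)) {
    if (b[field] === type) common[field] = type;
  }
  return common;
}

// Two hypothetical counterparty models for an 'invoice':
const buyerModel: DataModel = {
  invoiceId: "string",
  amount: "number",
  dueDate: "date",
  costCenter: "string", // buyer-internal, never shared
};
const sellerModel: DataModel = {
  invoiceId: "string",
  amount: "number",
  dueDate: "date",
  vatRate: "number", // seller-internal, never shared
};

// The agreed model covers only invoiceId, amount, and dueDate;
// party-internal fields drop out automatically.
const agreed = agreeCommonModel(buyerModel, sellerModel);
```

Whether such an agreement step belongs in the spec, or is left to integration, is exactly the open question here.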

jack-garvin commented 3 years ago

Anais, thanks for including me! I went a little long on this one, so bear with me.

Q1 - prerequisite - yes. The protocol should not require any data that does not directly support the process/workflow/transactions. The 'minimum' viable ecosystem should have minimal data requirements, similar to the security axiom of 'minimum required access' at each step of the process. Of course, the data model just needs to be extensible for future revisions. Also, I like the notion of using data graphs for oracles.

Q2 - internal data management - yes. Any required data, not just minimal data, should be fully accessible (CRUD), at least to the baseline administrator. Don't forget that the client needs to be able to eradicate all of their data when they exit the baselined environment (workflow). The client also needs to be able to make data available publicly, or with limited external availability, as necessary. So there is a need for public-facing data management as well (public, private, and internal data). Example: for registration purposes, you wouldn't need a sales tax authority unless it is required for your KYC compliance. KYC would be a micro-service requiring specific data. You may not want KYC on every workflow partner.

Phone book registration is where data management begins. I think we are going to have to offer a hierarchy of increasing capabilities: basic registration and minimal baseline data requirements all the way to super-deluxe 'everything-as-a-service'. Something like Minimal, Basic two-party, Multi-party, and Enterprise levels. Each would carry a data specification to support baselining relative to the number of parties involved in the workflow - me, us, them [1, 2, n].

What is the minimum data required for registration? ID and password (DID and private key). Everything else is going to be specific to the workflow data requirements. 'Baselining' is a process that occurs after registration. What is the minimum data needed to 'begin baselining', but not actually 'being baselined'?

I know this is obvious, but don't require data unless it is absolutely necessary. For every 'required' data element, you need additional support and verification schemes to maintain the data. Always start with a minimum implementation; additional data requirements get defined along the way. As baselining processes become normalized through network growth, you will be adding to the minimal spec (and all levels) as you get mass adoption in the marketplace.

In theory, if not in practice, I could set up an elaborate workflow that involves only one entity. Actually, you would model it with the blockchain and oracles as the other entity. Furthermore, I should be able to set up these entities: 1) an 'anonymous' entity (incognito) - the ability to include a workflow partner that requires 'absolute anonymity'; 2) an entity that serves as a placeholder for future workflow partners, to support the modeling of workflows - actual workflow partners would assume these 'placeholder' entities as the partners get registered; 3) an entity that can use aliases to support the modeling of workflows - this is similar to #2, but instead of setting up live dummy accounts, only an alias is needed from the original registrant.


jack-garvin commented 3 years ago

Assign to me, please. I have a team ready to tackle this one.

mehranshakeri commented 3 years ago

There are different topics discussed here:

  • Something like a handshake phase to learn about supported data models and their versions to agree on a common shared one (e.g. used for the ZKP workflow)
  • Minimum data for registration
  • Phonebook
  • Different ways to set up an entity in a workflow

My questions:

  1. What/Who is the Baseline administrator? Role specifications?
  2. Can we minimize the scope? It sounds like the scope is how to define the handshake protocol and interpret a schema?
  3. Can we have some visualization for this? Even links to existing documents would be helpful.
Consianimis commented 3 years ago

@mehranshakeri - Can you take the lead on the write up for this component by opening a PR that the team can review? You can pair with @jack-garvin . Last week we updated the ways of working. Please see issue #46 - Way of working.

mehranshakeri commented 3 years ago

@Consianimis Sure.

jack-garvin commented 3 years ago

Since I added a little confusion to the topic, let me take another pass. Bottom line - WorkFlow participants have to manage their own L2 tech stack, i.e. Baseline Administration.

The role responsible for maintaining the operating WorkFlow for the entity is the Baseline Administrator. This could be outsourced or incorporated into the current supply chain management team. It could be more than one person.

As we move through the Protocol project we will discover new constraints every day. Adjustments and modifications are part of the process. For starters, we have to make some ASSUMPTIONS about the WorkFlow and WorkGroup setup/environment.

The objective of the WorkFlow is to certify/verify invoices that arise from the workflow. So, let's ASSUME there are three Parties (DIDs) in one WorkGroup, with one WorkFlow assigned to the WorkGroup. Then, the Global Phone Book will have three entries. These are the pieces that need to be created to support this scenario: 1 workgroup, 1 workflow, 3 DIDs, 1 Global Phone Book, and 1 invoice asset token.

In addition to managing these data structures, you will have 'connectors' for each participant's system-of-record (SOR) integration. Internal to the process is event messaging and data payload handling - message queues. State trees are used to enforce event sequencing. More data management: connectors, messages+payloads, and event state tree.

So, Schema Management encompasses all of the above.
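Purely to make the three-party scenario above concrete, the named data structures could be sketched roughly as follows. The type names and DID strings are illustrative assumptions, not spec terms.

```typescript
// Rough sketch of the data structures named in the scenario above.
// All names here are illustrative, not part of any specification.

/** One entry in the Global Phone Book. */
interface PhoneBookEntry {
  did: string; // decentralized identifier of the participant
  orgName: string;
}

/** A workflow assigned to a workgroup, with its agreed schemas. */
interface WorkFlow {
  id: string;
  workGroupId: string;
  schemaIds: string[]; // schemas agreed at workflow inception
}

/** A workgroup: a set of participants sharing one or more workflows. */
interface WorkGroup {
  id: string;
  participantDids: string[];
  workflowIds: string[];
}

// The three-party invoice scenario: 3 DIDs, 1 workgroup, 1 workflow.
const phoneBook: PhoneBookEntry[] = [
  { did: "did:example:buyer", orgName: "Buyer Co" },
  { did: "did:example:seller", orgName: "Seller Co" },
  { did: "did:example:carrier", orgName: "Carrier Co" },
];

const workGroup: WorkGroup = {
  id: "wg-1",
  participantDids: phoneBook.map((e) => e.did),
  workflowIds: ["wf-invoice"],
};

const workFlow: WorkFlow = {
  id: "wf-invoice",
  workGroupId: "wg-1",
  schemaIds: ["invoice-v1"],
};
```

On top of these, the connectors, message queues, and event state tree mentioned above would each carry their own schemas - which is why "Schema Management" ends up encompassing all of them.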

kthomas commented 3 years ago

> Since I added a little confusion to the topic, let me take another pass. Bottom line - WorkFlow participants have to manage their own L2 tech stack, i.e. Baseline Administration.

No. There are many form factors that a baselined organization might take. The "lightest nodes" in the baseline network are not going to have to carry the burden of running their own L2 stack.

Also, I disagree with the "certifying invoices" objective being stated. This is of course an important case but we cannot base the protocol/spec on something that lives in what would previously have been considered Layer 7.

I agree that 'connectors' will do the things you suggest. Who runs this infrastructure depends on the size of the organization, its role in the ecosystem and its desire to participate...

About assumptions, I think we should not make assumptions when we actually have real data to leverage. In the case of invoices, we have the luxury of an actual customer case :)

kthomas commented 3 years ago

There are different topics discussed here:

  • Something like a handshake phase to learn about supported data models and their versions to agree on a common shared one (e.g. used for ZKP workflow)

An initial form of this handshake was introduced in the bri-1-privacy branch and is documented as part of the API specification. Check out the JOIN and SYNC opcodes on this branch.

  • Minimum data for registration
  • Phonebook
  • Different ways to setup an entity in a workflow

My questions:

  1. What/Who is the Baseline administrator? Role specifications?
  2. Can we minimize the scope? It sounds like the scope is how to define the handshake protocol and interpret a schema?
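Purely to ground the discussion, here is a loose sketch of what such a handshake could look like. It is inspired by, but not copied from, the JOIN and SYNC opcodes mentioned above; the message shapes and the 'highest shared version' rule are assumptions for illustration, not the actual bri-1 wire format.

```typescript
// Loose, hypothetical sketch of a schema-agreement handshake.
// Message shapes and the negotiation rule are assumptions, not the
// bri-1-privacy wire format.

/** A party announces which schemas and versions it supports. */
interface JoinMessage {
  op: "JOIN";
  senderDid: string;
  supportedSchemas: { id: string; version: number }[];
}

/** The resulting agreement (or lack thereof) for one schema id. */
interface SyncMessage {
  op: "SYNC";
  agreedSchema: { id: string; version: number } | null;
}

/** Pick the highest schema version both sides support for a schema id. */
function negotiate(a: JoinMessage, b: JoinMessage, schemaId: string): SyncMessage {
  const versionsA = a.supportedSchemas
    .filter((s) => s.id === schemaId)
    .map((s) => s.version);
  const versionsB = new Set(
    b.supportedSchemas.filter((s) => s.id === schemaId).map((s) => s.version),
  );
  const shared = versionsA.filter((v) => versionsB.has(v));
  return {
    op: "SYNC",
    agreedSchema: shared.length
      ? { id: schemaId, version: Math.max(...shared) }
      : null,
  };
}
```

If no shared version exists, the SYNC carries `null` and the parties know up front that baselining cannot proceed on that schema - which is arguably the whole value of putting the handshake in the protocol rather than leaving it to integration.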
Therecanbeonlyone1969 commented 3 years ago

@mehranshakeri @jack-garvin cc @Consianimis @kthomas

Could you please provide the relevant use case descriptions, with definitions, workflows, and business requirements, in a structured format that people can comment on specifically. This discussion is very confusing.

@Consianimis I am in favor of closing this issue unless a concise description of the relevant use cases, definitions, workflows, and business requirements, as is customary in standards work, is provided by 4/29 - or, of course, a direct standards contribution of requirements with accompanying text in the form of a PR.

jack-garvin commented 3 years ago

Brainstorm - Anais opened this issue with a request to brainstorm.

If brainstorming is too confusing, just let the people who are not confused continue discussing; listen and catch up.

jack-garvin commented 3 years ago

Can you point me to the BRI-1 requirements documentation?

Therecanbeonlyone1969 commented 3 years ago

@jack-garvin I am looking forward to reviewing the PR

jack-garvin commented 3 years ago

There are different topics discussed here:

  • Something like a handshake phase to learn about supported data models and their versions to agree on a common shared one (e.g. used for ZKP workflow)

KT- An initial form of this handshake was introduced in the bri-1-privacy branch and is documented as part of the API specification. Check out the JOIN and SYNC opcodes on this branch.

  • Minimum data for registration
  • Organization name

KT- * secp256k1 keypair (if Ethereum mainnet) -- it is of critical importance that this never signs any transactions that are connected to a workflow!

  • Phonebook

KT- * OrgRegistry contract + DIDs; this more or less exists today and works but will certainly need some thoughtful iteration...

  • Different ways to set up an entity in a workflow

My questions:

  1. What/Who is the Baseline administrator? Role specifications?

KT- * What do you mean by "administrator"? I have never heard of that before. JG- That's funny. I just defined it up above. It is a generic term for 'operations' management of the WF, WG, etc. Somebody has to keep the lights on. What term have you been using for this role?

  2. Can we minimize the scope? It sounds like the scope is how to define the handshake protocol and interpret a schema?

KT- * It's quite basic as defined today in core packages and BRI-1. Why not just use the existing, working pattern? :) JG- It is very simple in BRI-1. You mentioned the need for some thoughtful iteration. What differences do you see?

Since I added a little confusion to the topic, let me take another pass. Bottom line - WorkFlow participants have to manage their own L2 tech stack, i.e. Baseline Administration.

KT- No. There are many form factors that a baselined organization might take. The "lightest nodes" in the baseline network are not going to have to carry the burden of running their own L2 stack. JG- We are discussing Schema Management of the CORE. If these participants use the 'lightest nodes', they still have a node to manage. I have to assume that there is a requirement being met by these 'light nodes'. Those requirements will have to be enforced by L2.

KT- > Also, I disagree with the "certifying invoices" objective being stated. This is of course an important case but we cannot base the protocol/spec on something that lives in what would previously have been considered Layer 7. JG- I was assuming we are on the same page - my fault. Which use-case should we be discussing? Andreas made it clear that we need to start with a use-case. I'm just repeating what I have heard. JohnW repeats this point often. It's just an example, not a standard.

KT- > I agree that 'connectors' will do the things you suggest. Who runs this infrastructure depends on the size of the organization, its role in the ecosystem, and its desire to participate... JG- Participants choose who runs the L2 infrastructure as part of L2 setup. Connectors are a micro-service run on full nodes.

KT- > About assumptions, I think we should not make assumptions when we actually have real data to leverage. In the case of invoices, we have the luxury of an actual customer case :) JG- see above

jack-garvin commented 3 years ago

@jack-garvin I am looking forward to reviewing the PR

I think you meant @mehranshakeri. I will not be making PRs.

mehranshakeri commented 3 years ago

Here is the PR #52 - I hope it helps organize our brainstorming discussion :)