Open juergenhemelt opened 2 years ago
@mandy-chessell @planetf1 fyi
I am not sure how this is progressing but here are some thoughts ...
There are many descriptions of data products from different vendors and thought leaders. Some focus on the technical implementation/deployment; others focus on organizational/governance aspects such as service level agreements, licensing and ownership.
Each of these perspectives may be a valid focus for an organization at a particular point in time. Therefore I would propose that the data product is represented as a DataProduct classification that can be attached to any referenceable. This means it could be attached to a data set/API type asset or a server/container deployment, or it may be a more architectural/business construct that is attached to a solution component or digital service.
Over time, as an organization refines its definition of a data product, the classification could be moved to a higher-level concept to cover a more complete definition of the data product.
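To make the proposal a little more concrete, here is a minimal Python sketch of a classification that can be attached to any referenceable and later moved to a higher-level element. It is not Egeria's API: the `Referenceable` and `Classification` classes, the `classify` and `move_classification` helpers, and the example names and properties are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Classification:
    """Hypothetical stand-in for a classification such as DataProduct."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Referenceable:
    """Hypothetical stand-in for any referenceable element."""
    qualified_name: str
    type_name: str
    classifications: list = field(default_factory=list)


def classify(element, classification):
    """Attach a classification to an element, whatever its type."""
    element.classifications.append(classification)


def move_classification(name, source, target):
    """Move the named classification from one element to another."""
    for classification in source.classifications:
        if classification.name == name:
            source.classifications.remove(classification)
            target.classifications.append(classification)
            return classification
    raise ValueError(f"{source.qualified_name} has no {name} classification")


# Start by tagging a data set, then move the tag to a solution component
# once the organization's definition of a data product has matured.
orders = Referenceable("DataSet:orders", "DataSet")
delivery = Referenceable("SolutionComponent:orders-delivery", "SolutionComponent")
classify(orders, Classification("DataProduct", {"description": "Daily orders extract"}))
move_classification("DataProduct", orders, delivery)
```

Because the classification does not depend on the type of the element it is attached to, moving it does not require any change to the element types themselves.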
I have just updated the descriptions of digital services, information supply chains and solution components in the Area 7 types description, since they are relevant to the more complete view of a data product.
Here are suggested mappings from data product concepts to Egeria's open metadata types:
Data Product concept | Egeria open metadata types (with links) |
---|---|
Data Domain | Data domains are represented by SubjectAreaDefinition entities. The SubjectArea classification is used to tag elements from the subject area. |
Data Product Manager | The data product manager role is typed by the DigitalServiceManager. They have the business ownership of a collection of related data products represented by a DigitalService. Data products are grouped under a single digital service when they make use of similar processing. For example, they may use the same data, but formatted, scoped or processed differently with different licenses. |
Data Product | Each data product is identified by the DigitalProduct classification. The productType attribute can be used to identify the digital product as a data product. |
Data Product Design | The design of the data products' manufacturing and maintenance pipelines, along with the data products' storage and delivery mechanisms, is represented by the digital service's SolutionBlueprint linked to SolutionComponents. The DigitalProduct classification is added to the solution components that represent the data product delivery capability. |
Data Product Implementation | The manufacturing/maintenance solution components are linked to the appropriate data pipeline Processes using the ImplementedBy relationship. The data product's delivery solution components are also linked to the delivery data assets via the ImplementedBy relationship. |
Data Product Specification | There are many types of information that make up the data product specification. Different organizations will make their own choices, but here are some options. They can be linked to the solution components or data assets depending on how specific the information is: |
Data Product Subscription | A subscriber (person, organization, system, ...) can register with the marketplace using a DigitalSubscription. The different products selected by the subscriber are attached to the digital subscription via the AgreementItem relationship. Terms and Conditions can be added to the DigitalSubscription using the AttachedTermsAndConditions relationship. Overrides to the terms and conditions can be added to the AgreementItem relationship. |
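
The subscription row can be illustrated with a small Python sketch. The class names `DigitalSubscription` and `AgreementItem` are borrowed from the table, but the fields, methods and example values are hypothetical and do not reflect Egeria's actual type definitions or API. The sketch only shows how terms attached to the subscription could be overridden per product on the agreement item.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgreementItem:
    """Links one selected product to a subscription (fields are assumed)."""
    product_name: str
    override_terms: Optional[str] = None


@dataclass
class DigitalSubscription:
    """A subscriber's registration with the marketplace (fields are assumed)."""
    subscriber: str
    terms_and_conditions: str
    items: list = field(default_factory=list)

    def add_product(self, product_name, override_terms=None):
        """Attach a product, optionally overriding the subscription's terms."""
        self.items.append(AgreementItem(product_name, override_terms))

    def effective_terms(self, product_name):
        """Return the override if one is set, else the subscription's terms."""
        for item in self.items:
            if item.product_name == product_name:
                return item.override_terms or self.terms_and_conditions
        raise KeyError(product_name)


subscription = DigitalSubscription("analytics-team", "Internal use only")
subscription.add_product("orders-daily")
subscription.add_product("customers-masked", override_terms="Research use only")
```

Here `effective_terms("orders-daily")` falls back to the subscription-level terms, while `effective_terms("customers-masked")` returns the override held on the agreement item.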
Is there an existing issue for this?
Please describe the new behavior that will improve Egeria
The Egeria metamodel should include items for the description of Data Products in the context of a Data Mesh (https://martinfowler.com/articles/data-mesh-principles.html). There are existing developments and suggestions of how to do that. You can find some ideas here:
https://arnerossmann.github.io/post/2022-02-09_metadata-dataproduct/ https://github.com/agile-lab-dev/Data-Product-Specification
Alternatives
Using OpenMetadata (https://docs.open-metadata.org) instead of Egeria as suggested here https://github.com/agile-lab-dev/Data-Product-Specification
Any Further Information?
No response
Would you be prepared to be assigned this issue to work on?