os-climate / os_c_data_commons

Repository for Data Commons platform architecture overview, as well as developer and user documentation
Apache License 2.0

Open Questions on Implementing Data Mesh & Open Questions on User Roles, Data Access, Data Mgmt (types of data - geospatial, etc) #312

Open HeatherAck opened 1 year ago

HeatherAck commented 1 year ago

Solutions Architect Questions for Implementation of OS-Climate's Data Mesh

HeatherAck commented 1 year ago

Open Questions based on Technology - OS-C Solutions Architect

Data Mesh Implementation Questions

As the highest need is for an operational instance, the decision is to use the current Data Commons 1.0 as a stable environment, since it has been tested. Data Exchange, being the first real app to use this with a customer, will migrate to this instance. Mikhail will create a new cluster for Data Exchange to promote to a higher environment (need to define). Data Exchange will provide use cases to create a security profile.

For the data mesh, we don't want to diverge, but we need a strategy to get there.
The data mesh needs to provide a roadmap that is in context with all the OSC offerings. However, when we migrate to it, it has to be tested. Mikhail will connect with Vincent to establish a roadmap.

Data Exchange

Open MetaData

Trino

Data Storage

User / Access Management Questions

Keycloak

Discussion points: What is the identity/authentication provider of record (i.e. not GitHub, which is currently used)? Authorization is split among several technologies: Keycloak, Trino, OpenMetadata, and others? How can we apply role-based (RBAC) and attribute-based (ABAC) permissions across Trino, OpenMetadata, Keycloak, and the Phys Risk clients?
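To make the role-based plus attribute-based question more concrete, here is a minimal sketch of how the two kinds of checks could combine into one decision. It is an illustration only: the role names, permission strings, attribute keys, and the `is_allowed` helper are all hypothetical and do not reflect the actual Keycloak, Trino, or OpenMetadata APIs or any agreed OS-Climate policy.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table (illustrative names only; not taken
# from Keycloak, Trino, or OpenMetadata).
ROLE_PERMISSIONS = {
    "data_consumer": {"read"},
    "data_steward": {"read", "write"},
}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)


@dataclass
class Dataset:
    name: str
    # Attributes a user must carry to touch this dataset (assumed model).
    required_attributes: dict = field(default_factory=dict)


def is_allowed(user: User, dataset: Dataset, action: str) -> bool:
    """Allow the action only if both the role and attribute checks pass."""
    # Role-based check: at least one of the user's roles grants the action.
    role_ok = any(
        action in ROLE_PERMISSIONS.get(role, set()) for role in user.roles
    )
    # Attribute-based check: every attribute the dataset requires must match.
    attrs_ok = all(
        user.attributes.get(key) == value
        for key, value in dataset.required_attributes.items()
    )
    return role_ok and attrs_ok


# Example usage with made-up users and a made-up dataset.
alice = User("alice", {"data_consumer"}, {"org": "member"})
bob = User("bob", {"data_steward"}, {"org": "guest"})
restricted = Dataset("restricted_dataset", {"org": "member"})

alice_can_read = is_allowed(alice, restricted, "read")
alice_can_write = is_allowed(alice, restricted, "write")
bob_can_write = is_allowed(bob, restricted, "write")
```

In this sketch a request must pass both checks, so a steward role alone is not enough when the required attribute is missing. Where such a decision would actually be enforced (identity provider, query engine, or metadata catalog) is exactly the open question raised above.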

Namespace

DevOps

Data Access

Permissions & Roles

Security

HeatherAck commented 1 year ago

@mbogoevici can you please help define the answers to the component questions?

HeatherAck commented 1 year ago

The ArgoCD plugins are now working, but Fybrik and Pachyderm are showing as degraded. Take a look at the events in ArgoCD. Will look at the OpenShift console for any issues (look at the objects); from the CLI, can use kubectl describe. After fixing the degraded issues, next run the app of apps for Airflow, Jupyter, etc., then set up S3 and SSO. Focus on these steps for the week of 26-June.

HeatherAck commented 1 year ago

Resolved most of the ArgoCD issues. The Pachyderm operator is not working; need to install group. Rebuilding Elyra images. Still need to set up time with @mbogoevici; @ryanaslett to set up time with him on 31-Jul.

HeatherAck commented 1 year ago

See also:
https://github.com/opendatahub-io-contrib/data-mesh-pattern/issues/83
https://github.com/opendatahub-io-contrib/data-mesh-pattern/issues/82
https://github.com/opendatahub-io-contrib/data-mesh-pattern/issues/80
https://github.com/opendatahub-io-contrib/data-mesh-pattern/issues/79

HeatherAck commented 1 year ago

@jpaulrajredhat to update the opendatahub issues today. @redmikhail is resending the URL to @jpaulrajredhat: https://console-openshift-console.apps.osc-cl4.apps.os-climate.org/

jpaulrajredhat commented 1 year ago

@HeatherAck who should be the contact person for the cluster 4 installation? It looks like the tools are only partially installed and some of the components are in a failed state.

HeatherAck commented 1 year ago

@ryanaslett (and his manager, @grigarr) are the primary contacts.

jpaulrajredhat commented 1 year ago

@HeatherAck Could you please set up a meeting with @ryanaslett? I need to understand how far the installation got and which components / tools failed. I can see that, in the tooling installation itself, some of the components failed due to a space issue.

HeatherAck commented 1 year ago

@jpaulrajredhat - he is free any day after 11 AM PT. What works best for you?