os-climate / os_c_data_commons

Repository for Data Commons platform architecture overview, as well as developer and user documentation
Apache License 2.0

Need Elyra pipeline documentation #82

Open MichaelTiemannOSC opened 2 years ago

MichaelTiemannOSC commented 2 years ago

As a data ingester, I want to ingest the WRI GPPD data and then pass metadata to a metadata update process. But I don't know the best way to pass my specific metadata (schema name for WRI GPPD, tables I create as a result of ingestion, and information about the fields of those tables) to a generic metadata upload process.

Please update https://github.com/os-climate/os_c_data_commons/blob/main/docs/create-processing-pipeline.md
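
One possible shape for the metadata described above (schema name, tables created, and per-field information) is a small JSON document that the ingestion step writes and a generic metadata step reads. The following is only a minimal sketch under stated assumptions: the class names `FieldMeta`, `TableMeta`, and `IngestMeta`, the helper functions, and the example values are all hypothetical and do not reflect an existing Data Commons or Elyra API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class FieldMeta:
    # Hypothetical per-column description produced by the ingestion step.
    name: str
    dtype: str
    description: str = ""


@dataclass
class TableMeta:
    # Hypothetical description of one table created during ingestion.
    name: str
    fields: List[FieldMeta] = field(default_factory=list)


@dataclass
class IngestMeta:
    # Hypothetical top-level record: one schema and the tables created in it.
    schema: str
    tables: List[TableMeta] = field(default_factory=list)


def to_json(meta: IngestMeta) -> str:
    """Serialize the metadata so it can be handed to a later, generic step."""
    return json.dumps(asdict(meta), indent=2)


def from_json(text: str) -> IngestMeta:
    """Rebuild the metadata objects on the consuming side."""
    raw = json.loads(text)
    tables = [
        TableMeta(name=t["name"], fields=[FieldMeta(**f) for f in t["fields"]])
        for t in raw["tables"]
    ]
    return IngestMeta(schema=raw["schema"], tables=tables)


# Illustrative values only; the real schema, table, and field names may differ.
example = IngestMeta(
    schema="wri_gppd",
    tables=[
        TableMeta(
            name="plants",
            fields=[
                FieldMeta("country", "varchar", "Country code"),
                FieldMeta("capacity_mw", "double", "Generating capacity in MW"),
            ],
        )
    ],
)
payload = to_json(example)
```

With this layout, the ingestion step would only need to write `payload` to a file or pass it as a parameter, and the generic step would call `from_json` without knowing anything specific to WRI GPPD. How that handoff is wired depends on the pipeline tool chosen.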

caldeirav commented 2 years ago

Generic documentation and examples of pipelines to refer to: https://github.com/elyra-ai/examples

eoriorda commented 2 years ago

We need to re-create the pipeline. Consolidate all the tools in one place, incorporating all the new changes from Eric, and document an end-to-end solution for rebuilding the pipeline.

eoriorda commented 1 year ago

@erikerlandson Is this something you are actively working on? Do you need any clarification?

caldeirav commented 1 year ago

Moved back to the backlog as we need to consider which pipeline automation tool we want to adopt (Elyra / Airflow / Kubeflow).

HeatherAck commented 1 year ago

@caldeirav Is this a question for the TAC, or a tactical one for the DC TSC? Thanks.

caldeirav commented 1 year ago

There has to be a technical discussion at the level of the Data Commons stream before we can determine this. The TAC is likely required only if we are looking at a fundamental change in approach, not for a choice of tooling alone.