zazuko / barnard59

An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.

[rdf] GitLab PROV-O metadata step #134

Open cristianvasquez opened 1 year ago

cristianvasquez commented 1 year ago

Description: A prov-metadata step that produces lineage triples from GitLab CI environment variables.

Motivation: For some datasets, it is useful to record links to the pipeline runs that produced them, the name of the person who worked on the code, the barnard59 pipeline itself, and so on.

While a GitLab CI pipeline is running, the following predefined variables are available: https://docs.gitlab.com/ee/ci/variables/predefined_variables.html

The goal is to add metadata about the pipeline run to the output. The vocabulary for modeling it is PROV-O: https://www.w3.org/TR/prov-o/.
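To make the idea concrete, here is a minimal sketch of the mapping, not the actual step. The function name `provTriplesFromGitlabEnv`, the dataset IRI, and the sample values are hypothetical. The variable names (`CI_PIPELINE_URL`, `CI_PIPELINE_CREATED_AT`, `CI_PROJECT_URL`, `CI_COMMIT_SHA`, `CI_SERVER_URL`, `GITLAB_USER_LOGIN`, `GITLAB_USER_NAME`) come from the GitLab list above. The commit and user IRIs are built from the usual GitLab web URL patterns, which is an assumption, as is using the pipeline creation time for `prov:startedAtTime`. It emits plain N-Triples strings to stay dependency-free; a real step would more likely emit RDF/JS quads.

```javascript
// Hypothetical helper, not part of barnard59: maps GitLab CI variables to PROV-O triples.
const PROV = 'http://www.w3.org/ns/prov#'
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'
const RDFS_LABEL = 'http://www.w3.org/2000/01/rdf-schema#label'
const XSD_DATETIME = 'http://www.w3.org/2001/XMLSchema#dateTime'

// Serialize an IRI and a literal in N-Triples syntax.
// JSON string escaping is used as a simple stand-in for N-Triples escaping.
const iri = value => `<${value}>`
const literal = (value, datatype) => {
  const quoted = JSON.stringify(String(value))
  return datatype ? `${quoted}^^${iri(datatype)}` : quoted
}

function provTriplesFromGitlabEnv (datasetIri, env) {
  const pipeline = env.CI_PIPELINE_URL
  if (!pipeline) return [] // not running inside GitLab CI, nothing to describe

  // The pipeline run is the activity that generated the dataset.
  const triples = [
    [iri(pipeline), iri(RDF_TYPE), iri(`${PROV}Activity`)],
    [iri(datasetIri), iri(`${PROV}wasGeneratedBy`), iri(pipeline)]
  ]

  // Assumption: the pipeline creation time approximates the activity start.
  if (env.CI_PIPELINE_CREATED_AT) {
    triples.push([iri(pipeline), iri(`${PROV}startedAtTime`),
      literal(env.CI_PIPELINE_CREATED_AT, XSD_DATETIME)])
  }

  // Assumption: commits are addressable at <project>/-/commit/<sha>.
  if (env.CI_PROJECT_URL && env.CI_COMMIT_SHA) {
    const commit = `${env.CI_PROJECT_URL}/-/commit/${env.CI_COMMIT_SHA}`
    triples.push([iri(commit), iri(RDF_TYPE), iri(`${PROV}Entity`)])
    triples.push([iri(pipeline), iri(`${PROV}used`), iri(commit)])
  }

  // Assumption: user profiles are addressable at <server>/<login>.
  if (env.CI_SERVER_URL && env.GITLAB_USER_LOGIN) {
    const user = `${env.CI_SERVER_URL}/${env.GITLAB_USER_LOGIN}`
    triples.push([iri(user), iri(RDF_TYPE), iri(`${PROV}Agent`)])
    triples.push([iri(pipeline), iri(`${PROV}wasAssociatedWith`), iri(user)])
    if (env.GITLAB_USER_NAME) {
      triples.push([iri(user), iri(RDFS_LABEL), literal(env.GITLAB_USER_NAME)])
    }
  }

  return triples.map(parts => `${parts.join(' ')} .`)
}

// Example with made-up values; inside a CI job one would pass process.env instead.
const sampleEnv = {
  CI_PIPELINE_URL: 'https://gitlab.example.com/group/project/-/pipelines/42',
  CI_PIPELINE_CREATED_AT: '2023-05-01T10:00:00Z',
  CI_PROJECT_URL: 'https://gitlab.example.com/group/project',
  CI_COMMIT_SHA: '1ecfd275763eff1d6b4844ea3168962458c9f27a',
  CI_SERVER_URL: 'https://gitlab.example.com',
  GITLAB_USER_LOGIN: 'jdoe',
  GITLAB_USER_NAME: 'Jane Doe'
}
const lines = provTriplesFromGitlabEnv('https://example.org/dataset/1', sampleEnv)
console.log(lines.join('\n'))
```

Returning an empty list when `CI_PIPELINE_URL` is missing means the same pipeline can still run locally without producing misleading provenance.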

Other related resources:

https://github.com/DLR-SC/gitlab2prov
https://github.com/DLR-SC/Gitlab2Graph

cristianvasquez commented 1 year ago

Related: https://academic.oup.com/gigascience/article/8/11/giz095/5611001?login=false

cristianvasquez commented 1 year ago

Pull request: https://github.com/zazuko/barnard59-rdf/pull/31