acdh-oeaw / rs-customisation

Automatize Flushing and Repopulating the Triple Store #39

Open vronk opened 7 months ago

vronk commented 7 months ago

@lu-pl, create an automation (script, GitHub Action) to wipe and repopulate the triple store with some or all of the data. (Discuss with @dpancic.)

To consider:

lu-pl commented 7 months ago

One option for automated ingest of data graphs: set up a human- and machine-readable (YAML) registry file that tracks the data graphs.

Something like:

  graph:
    - source: https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/rem/rem.ttl
      graph_id: https://clscor.io/entity/graph/rem
      mime: text/turtle

A script could then read the registry file and send update requests to the store.
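A minimal sketch of such a script, assuming the store exposes a SPARQL 1.1 Update endpoint and is able to fetch remote URLs itself. The registry is shown as already-parsed Python data (one mapping per graph, roughly what a YAML loader would return); `build_updates` is a hypothetical name, not something from this repository.

```python
def build_updates(entries):
    """Return SPARQL 1.1 Update strings that wipe and reload each named graph."""
    updates = []
    for entry in entries:
        graph = entry["graph_id"]
        # DROP SILENT does not fail if the graph does not exist yet.
        updates.append(f"DROP SILENT GRAPH <{graph}>")
        updates.append(f"LOAD <{entry['source']}> INTO GRAPH <{graph}>")
    return updates


# Parsed form of a registry entry like the one above
# (mime left out here: LOAD does not use it).
registry = [
    {
        "source": "https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/rem/rem.ttl",
        "graph_id": "https://clscor.io/entity/graph/rem",
    }
]

for update in build_updates(registry):
    print(update)
```

Each string could then be sent to the store's update endpoint as an HTTP POST with content type `application/sparql-update`; the endpoint URL and any authentication depend on the deployment.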

Benefits of this approach would be

lu-pl commented 7 months ago

As mentioned elsewhere, I came up with the following metadata schema for named graphs:

<graph_uri> a rdfg:Graph, sd:NamedGraph, crmdig:D9_Data_Object .

[a crmdig:D10_Software_Execution] crmdig:L11_had_output <graph_uri> ;
    crm:P82_begin_of_the_begin "<time value>" ;
    crmdig:L23_used_software_or_firmware [
        a crmdig:D14_Software ;
        crm:P1_is_identified_by [
            a crm:E42_Identifier ;
            crm:P190_has_symbolic_content <script_uri>
        ]
    ] ;
    crmdig:L12_happened_on_device [
        a crmdig:D8_Digital_Device ;
        crm:P129i_is_subject_of [
            a crm:E73_Information_Object ;
            crm:P2_has_type
                <https://vocabs.sshopencloud.eu/browse/media-type/en/page/applicationslashjson> ;
            crm:P190_has_symbolic_content "<json system info>"
        ]
    ] .
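
A rough sketch of how an ingest script might fill in this schema for one graph. The template mirrors the structure above with prefix declarations left out; `graph_metadata`, `escape_literal`, the example script URI, and the choice of which system facts to record are all assumptions made for illustration.

```python
import json
import platform
from datetime import datetime, timezone

# Prefix declarations are omitted, as in the schema above.
TEMPLATE = """\
<{graph_uri}> a rdfg:Graph, sd:NamedGraph, crmdig:D9_Data_Object .

[a crmdig:D10_Software_Execution] crmdig:L11_had_output <{graph_uri}> ;
    crm:P82_begin_of_the_begin "{timestamp}" ;
    crmdig:L23_used_software_or_firmware [
        a crmdig:D14_Software ;
        crm:P1_is_identified_by [
            a crm:E42_Identifier ;
            crm:P190_has_symbolic_content <{script_uri}>
        ]
    ] ;
    crmdig:L12_happened_on_device [
        a crmdig:D8_Digital_Device ;
        crm:P129i_is_subject_of [
            a crm:E73_Information_Object ;
            crm:P2_has_type
                <https://vocabs.sshopencloud.eu/browse/media-type/en/page/applicationslashjson> ;
            crm:P190_has_symbolic_content "{system_info}"
        ]
    ] .
"""


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def graph_metadata(graph_uri, script_uri, now=None):
    """Render the metadata block for one ingested named graph."""
    now = now or datetime.now(timezone.utc)
    # Which system facts to record is an assumption of this sketch.
    info = {"system": platform.system(), "python": platform.python_version()}
    return TEMPLATE.format(
        graph_uri=graph_uri,
        timestamp=now.isoformat(),
        script_uri=script_uri,
        system_info=escape_literal(json.dumps(info)),
    )


print(graph_metadata(
    "https://clscor.io/entity/graph/rem",
    "https://example.org/scripts/ingest",  # hypothetical script URI
))
```

The JSON system info is embedded as an escaped string literal, so it can be recovered later by unescaping and parsing it.
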

lu-pl commented 7 months ago

Example YAML registry:

graphs:

  # ttl
  - source: https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/rem/rem.ttl
    graph_id: https://clscor.io/entity/graph/rem
    mime: text/turtle

  # multiple ttl to a single named graph
  - source:
      - https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/eltec/eltec_cze.ttl
      - https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/eltec/eltec_deu.ttl
    graph_id: https://clscor.io/entity/graph/eltec
    mime: text/turtle

  # trig
  - source: <trig source>
    mime: application/trig
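
A small sketch of how a reader script might normalise these entries, since `source` can be a single URL or a list and the TriG entry has no `graph_id`. It assumes, as the example suggests, that TriG sources name their own graphs and that every other format needs a `graph_id`; `normalize` is a hypothetical helper, and the entries are shown as already-parsed Python data.

```python
def normalize(entry):
    """Return (sources, graph_id, mime), with sources always a list."""
    source = entry["source"]
    sources = [source] if isinstance(source, str) else list(source)
    graph_id = entry.get("graph_id")
    mime = entry.get("mime")
    # TriG names its graphs itself; every other format needs a graph_id.
    if graph_id is None and mime != "application/trig":
        raise ValueError(f"graph_id missing for source(s) {sources}")
    return sources, graph_id, mime


# Parsed form of two registry entries (mime left out of the first for brevity).
eltec = {
    "source": [
        "https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/eltec/eltec_cze.ttl",
        "https://raw.githubusercontent.com/lu-pl/clscorgi/main/clscorgi/output/eltec/eltec_deu.ttl",
    ],
    "graph_id": "https://clscor.io/entity/graph/eltec",
}
trig = {"source": "<trig source>", "mime": "application/trig"}

print(normalize(eltec))
print(normalize(trig))
```

Failing early on a missing `graph_id` keeps a typo in the registry from silently loading data into the wrong (or default) graph.
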

vera-charvat commented 7 months ago

.. adding some notes (not consolidated, sorry) from the CLS Infra JFx discussion (2024-03-13), featuring input from Dalibor Pančić, Bernhard Oberreither and Daniel Elsner: