zazuko / cube-creator

A tool to create RDF cubes from CSV files
GNU Affero General Public License v3.0

fix: use predefined shapes to query shared dimensions #1357

Closed tpluscode closed 1 year ago

tpluscode commented 1 year ago

Big change to how shared dimensions and hierarchies are stored in the database

Currently, every saved resource is stored alongside a SHACL shape that describes its properties. This proved difficult to query and manage at runtime. Additionally, all blank nodes were replaced with urn: identifiers, which only added to the code needed to process the API calls.

As a result, some queries became slow and the shared dimensions graph became bloated with excess triples.

To mitigate that, the process is being simplified: resources will be saved with blank nodes intact and retrieved using predefined shapes for specific types (dimension, dimension term, hierarchy, etc.).

A one-time data cleanup is advised.

First, create a backup of the graph <https://lindas.admin.ch/cube/dimension>, just in case.
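
One way to take that backup is sketched below, assuming the store accepts SPARQL 1.1 Update requests and you have write access. The backup graph name is hypothetical; any graph that does not hold data you need will do.

```sparql
# Copy every triple of the shared dimensions graph into a backup graph.
# <https://example.org/backup/cube-dimension> is a placeholder name (assumption).
# Note: COPY removes anything already present in the destination graph first.
COPY <https://lindas.admin.ch/cube/dimension>
  TO <https://example.org/backup/cube-dimension>
```

If update access is not available, dumping the graph to a file with a plain `CONSTRUCT { ?s ?p ?o }` query serves the same purpose.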

Use the queries below to export only the important data. They are similar to queries the API will run but also ensure that urn: identifiers are changed to blank nodes.
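
The urn-to-blank-node rewrite that the export queries apply to every value can be seen in isolation in the small query below. It is only an illustration: the two values in the `VALUES` clause are made up, and nothing is read from the dimensions graph.

```sparql
# Illustration only: both input IRIs are invented examples (assumption).
SELECT ?value ?rewritten
WHERE {
  VALUES ?value { <urn:example:node-1> <https://example.org/term/a> }
  # Values whose string form starts with 'urn' become blank nodes;
  # everything else is passed through unchanged.
  BIND(IF(STRSTARTS(STR(?value), 'urn'), BNODE(STR(?value)), ?value) AS ?rewritten)
}
```

Per the SPARQL 1.1 definition, `BNODE` with a string argument returns the same blank node only within a single solution, so it is worth inspecting the exported data on your store before clearing anything.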

Export Dimensions

```sparql
PREFIX rdf:
PREFIX schema:
PREFIX sh:
PREFIX qudt:
PREFIX meta:
PREFIX time:
PREFIX hydra:
PREFIX rdfs:

CONSTRUCT {
  ?resource rdf:type .
  ?resource rdf:type ?resource_0_n .
  ?resource schema:name ?resource_1_n .
  ?resource schema:validFrom ?resource_2_n .
  ?resource schema:validThrough ?resource_3_n .
  ?resource schema:alternateName ?resource_4_n .
  ?resource sh:property ?resource_5_n .
  ?resource_5_n schema:name ?resource_5_0_n .
  ?resource_5_n qudt:scaleType ?resource_5_1_n .
  ?resource_5_n meta:dataKind ?resource_5_2_n .
  ?resource_5_2_n rdf:type ?resource_5_2_0_n .
  ?resource_5_2_n time:unitType ?resource_5_2_1_n .
  ?resource schema:additionalProperty ?resource_6_n .
  ?resource_6_n hydra:required ?resource_6_0_n .
  ?resource_6_n ?resource_6_1_n .
  ?resource_6_n sh:class ?resource_6_2_n .
  ?resource_6_n rdf:predicate ?resource_6_3_n .
  ?resource_6_n schema:multipleValues ?resource_6_4_n .
  ?resource_6_n rdfs:label ?resource_6_5_n .
  ?resource_6_n sh:datatype ?resource_6_6_n .
  ?resource_6_n sh:languageIn ?resource_6_7_n .
}
FROM
WHERE {
  ?resource rdf:type .
  {
    ?resource rdf:type ?resource_0 .
    bind(IF(STRSTARTS(str(?resource_0), 'urn'), BNODE(str(?resource_0)), ?resource_0) as ?resource_0_n)
  } UNION {
    ?resource schema:name ?resource_1 .
    bind(IF(STRSTARTS(str(?resource_1), 'urn'), BNODE(str(?resource_1)), ?resource_1) as ?resource_1_n)
  } UNION {
    ?resource schema:validFrom ?resource_2 .
    bind(IF(STRSTARTS(str(?resource_2), 'urn'), BNODE(str(?resource_2)), ?resource_2) as ?resource_2_n)
  } UNION {
    ?resource schema:validThrough ?resource_3 .
    bind(IF(STRSTARTS(str(?resource_3), 'urn'), BNODE(str(?resource_3)), ?resource_3) as ?resource_3_n)
  } UNION {
    ?resource schema:alternateName ?resource_4 .
    bind(IF(STRSTARTS(str(?resource_4), 'urn'), BNODE(str(?resource_4)), ?resource_4) as ?resource_4_n)
  } UNION {
    ?resource sh:property ?resource_5 .
    bind(IF(STRSTARTS(str(?resource_5), 'urn'), BNODE(str(?resource_5)), ?resource_5) as ?resource_5_n)
  } UNION {
    ?resource sh:property ?resource_5 .
    bind(IF(STRSTARTS(str(?resource_5), 'urn'), BNODE(str(?resource_5)), ?resource_5) as ?resource_5_n)
    ?resource_5 schema:name ?resource_5_0 .
    bind(IF(STRSTARTS(str(?resource_5_0), 'urn'), BNODE(str(?resource_5_0)), ?resource_5_0) as ?resource_5_0_n)
  } UNION {
    ?resource sh:property ?resource_5 .
    bind(IF(STRSTARTS(str(?resource_5), 'urn'), BNODE(str(?resource_5)), ?resource_5) as ?resource_5_n)
    ?resource_5 qudt:scaleType ?resource_5_1 .
    bind(IF(STRSTARTS(str(?resource_5_1), 'urn'), BNODE(str(?resource_5_1)), ?resource_5_1) as ?resource_5_1_n)
  } UNION {
    ?resource sh:property ?resource_5 .
    bind(IF(STRSTARTS(str(?resource_5), 'urn'), BNODE(str(?resource_5)), ?resource_5) as ?resource_5_n)
    ?resource_5 meta:dataKind ?resource_5_2 .
    bind(IF(STRSTARTS(str(?resource_5_2), 'urn'), BNODE(str(?resource_5_2)), ?resource_5_2) as ?resource_5_2_n)
  } UNION {
    ?resource sh:property ?resource_5 .
    ?resource_5 meta:dataKind ?resource_5_2 .
    bind(IF(STRSTARTS(str(?resource_5_2), 'urn'), BNODE(str(?resource_5_2)), ?resource_5_2) as ?resource_5_2_n)
    ?resource_5_2 rdf:type ?resource_5_2_0 .
    bind(IF(STRSTARTS(str(?resource_5_2_0), 'urn'), BNODE(str(?resource_5_2_0)), ?resource_5_2_0) as ?resource_5_2_0_n)
  } UNION {
    ?resource sh:property ?resource_5 .
    ?resource_5 meta:dataKind ?resource_5_2 .
    bind(IF(STRSTARTS(str(?resource_5_2), 'urn'), BNODE(str(?resource_5_2)), ?resource_5_2) as ?resource_5_2_n)
    ?resource_5_2 time:unitType ?resource_5_2_1 .
    bind(IF(STRSTARTS(str(?resource_5_2_1), 'urn'), BNODE(str(?resource_5_2_1)), ?resource_5_2_1) as ?resource_5_2_1_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 hydra:required ?resource_6_0 .
    bind(IF(STRSTARTS(str(?resource_6_0), 'urn'), BNODE(str(?resource_6_0)), ?resource_6_0) as ?resource_6_0_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 ?resource_6_1 .
    bind(IF(STRSTARTS(str(?resource_6_1), 'urn'), BNODE(str(?resource_6_1)), ?resource_6_1) as ?resource_6_1_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 sh:class ?resource_6_2 .
    bind(IF(STRSTARTS(str(?resource_6_2), 'urn'), BNODE(str(?resource_6_2)), ?resource_6_2) as ?resource_6_2_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 rdf:predicate ?resource_6_3 .
    bind(IF(STRSTARTS(str(?resource_6_3), 'urn'), BNODE(str(?resource_6_3)), ?resource_6_3) as ?resource_6_3_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 schema:multipleValues ?resource_6_4 .
    bind(IF(STRSTARTS(str(?resource_6_4), 'urn'), BNODE(str(?resource_6_4)), ?resource_6_4) as ?resource_6_4_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 rdfs:label ?resource_6_5 .
    bind(IF(STRSTARTS(str(?resource_6_5), 'urn'), BNODE(str(?resource_6_5)), ?resource_6_5) as ?resource_6_5_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 sh:datatype ?resource_6_6 .
    bind(IF(STRSTARTS(str(?resource_6_6), 'urn'), BNODE(str(?resource_6_6)), ?resource_6_6) as ?resource_6_6_n)
  } UNION {
    ?resource schema:additionalProperty ?resource_6 .
    bind(IF(STRSTARTS(str(?resource_6), 'urn'), BNODE(str(?resource_6)), ?resource_6) as ?resource_6_n)
    ?resource_6 sh:languageIn ?resource_6_7 .
    bind(IF(STRSTARTS(str(?resource_6_7), 'urn'), BNODE(str(?resource_6_7)), ?resource_6_7) as ?resource_6_7_n)
  }
}
```

Export Dimension Terms

```sparql
PREFIX rdf:
PREFIX schema:

CONSTRUCT {
  ?resource ?p ?o
}
FROM
WHERE {
  ?resource rdf:type .
  ?resource ?p ?o
}
```

Export Hierarchies

```sparql
PREFIX rdf:
PREFIX schema:
PREFIX meta:
PREFIX sh:

CONSTRUCT {
  ?resource rdf:type .
  ?resource rdf:type ?resource_0_n .
  ?resource schema:name ?resource_1_n .
  ?resource ?resource_2_n .
  ?resource meta:hierarchyRoot ?resource_3_n .
  ?resource_4_i_n meta:nextInHierarchy ?resource_4_n .
  ?resource_4_n rdf:type ?resource_4_0_n .
  ?resource_4_n schema:name ?resource_4_1_n .
  ?resource_4_n sh:targetClass ?resource_4_2_n .
  ?resource_4_n sh:path ?resource_4_3_n .
  ?resource_4_3_n sh:inversePath ?resource_4_3_0_n .
}
FROM
WHERE {
  ?resource rdf:type .
  {
    ?resource rdf:type ?resource_0 .
    bind(IF(STRSTARTS(str(?resource_0), 'urn'), BNODE(str(?resource_0)), ?resource_0) as ?resource_0_n)
  } UNION {
    ?resource schema:name ?resource_1 .
    bind(IF(STRSTARTS(str(?resource_1), 'urn'), BNODE(str(?resource_1)), ?resource_1) as ?resource_1_n)
  } UNION {
    ?resource ?resource_2 .
    bind(IF(STRSTARTS(str(?resource_2), 'urn'), BNODE(str(?resource_2)), ?resource_2) as ?resource_2_n)
  } UNION {
    ?resource meta:hierarchyRoot ?resource_3 .
    bind(IF(STRSTARTS(str(?resource_3), 'urn'), BNODE(str(?resource_3)), ?resource_3) as ?resource_3_n)
  } UNION {
    ?resource meta:nextInHierarchy* ?resource_4_i .
    ?resource_4_i meta:nextInHierarchy ?resource_4 .
    bind(IF(STRSTARTS(str(?resource_4_i), 'urn'), BNODE(str(?resource_4_i)), ?resource_4_i) as ?resource_4_i_n)
    bind(IF(STRSTARTS(str(?resource_4), 'urn'), BNODE(str(?resource_4)), ?resource_4) as ?resource_4_n)
  } UNION {
    ?resource meta:nextInHierarchy* ?resource_4_i .
    ?resource_4_i meta:nextInHierarchy ?resource_4 .
    ?resource_4 rdf:type ?resource_4_0 .
    bind(IF(STRSTARTS(str(?resource_4), 'urn'), BNODE(str(?resource_4)), ?resource_4) as ?resource_4_n)
    bind(IF(STRSTARTS(str(?resource_4_0), 'urn'), BNODE(str(?resource_4_0)), ?resource_4_0) as ?resource_4_0_n)
  } UNION {
    ?resource meta:nextInHierarchy* ?resource_4_i .
    ?resource_4_i meta:nextInHierarchy ?resource_4 .
    ?resource_4 schema:name ?resource_4_1 .
    bind(IF(STRSTARTS(str(?resource_4), 'urn'), BNODE(str(?resource_4)), ?resource_4) as ?resource_4_n)
    bind(IF(STRSTARTS(str(?resource_4_1), 'urn'), BNODE(str(?resource_4_1)), ?resource_4_1) as ?resource_4_1_n)
  } UNION {
    ?resource meta:nextInHierarchy* ?resource_4_i .
    ?resource_4_i meta:nextInHierarchy ?resource_4 .
    ?resource_4 sh:targetClass ?resource_4_2 .
    bind(IF(STRSTARTS(str(?resource_4), 'urn'), BNODE(str(?resource_4)), ?resource_4) as ?resource_4_n)
    bind(IF(STRSTARTS(str(?resource_4_2), 'urn'), BNODE(str(?resource_4_2)), ?resource_4_2) as ?resource_4_2_n)
  } UNION {
    ?resource meta:nextInHierarchy* ?resource_4_i .
    ?resource_4_i meta:nextInHierarchy ?resource_4 .
    ?resource_4 sh:path ?resource_4_3 .
    bind(IF(STRSTARTS(str(?resource_4), 'urn'), BNODE(str(?resource_4)), ?resource_4) as ?resource_4_n)
    bind(IF(STRSTARTS(str(?resource_4_3), 'urn'), BNODE(str(?resource_4_3)), ?resource_4_3) as ?resource_4_3_n)
  } UNION {
    ?resource meta:nextInHierarchy* ?resource_4_i .
    ?resource_4_i meta:nextInHierarchy ?resource_4 .
    ?resource_4 sh:path ?resource_4_3 .
    ?resource_4_3 sh:inversePath ?resource_4_3_0 .
    bind(IF(STRSTARTS(str(?resource_4_3), 'urn'), BNODE(str(?resource_4_3)), ?resource_4_3) as ?resource_4_3_n)
    bind(IF(STRSTARTS(str(?resource_4_3_0), 'urn'), BNODE(str(?resource_4_3_0)), ?resource_4_3_0) as ?resource_4_3_0_n)
  }
}
```

Then clear the graph <https://lindas.admin.ch/cube/dimension> and upload the exported data to it.
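
The clear-and-reload step might look like the following SPARQL 1.1 Update request. This is a sketch under assumptions: the file URL is a placeholder, and whether `LOAD` can read local files depends on how the store is configured. Uploading the files through the store's own import tooling or the Graph Store HTTP Protocol works just as well.

```sparql
# Remove all triples currently in the shared dimensions graph.
CLEAR GRAPH <https://lindas.admin.ch/cube/dimension> ;

# Load one exported file back into the same graph.
# <file:///path/to/dimensions-export.nt> is a placeholder (assumption);
# repeat the LOAD for the dimension terms and hierarchies exports.
LOAD <file:///path/to/dimensions-export.nt>
  INTO GRAPH <https://lindas.admin.ch/cube/dimension>
```

Run it only after confirming that the backup and all three exports are complete.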

Finally, restart the app.
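
Once the data is back in place, a query along these lines can confirm that no urn: identifiers survived the cleanup. It is a sketch rather than part of this PR, and it only looks at IRIs in subject and object position.

```sparql
# Count triples in the cleaned graph that still mention a urn: IRI.
SELECT (COUNT(*) AS ?remaining)
FROM <https://lindas.admin.ch/cube/dimension>
WHERE {
  ?s ?p ?o .
  FILTER (
    (isIRI(?s) && STRSTARTS(STR(?s), 'urn')) ||
    (isIRI(?o) && STRSTARTS(STR(?o), 'urn'))
  )
}
```

A count of zero suggests the rewrite covered everything; a non-zero count is worth inspecting, although it may also reflect urn: IRIs that legitimately belong in the data.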

changeset-bot[bot] commented 1 year ago

🦋 Changeset detected

Latest commit: 810440614c5989d099c9341b61baf762b1e8e29e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

| Name                                | Type  |
| ----------------------------------- | ----- |
| @cube-creator/shared-dimensions-api | Major |
| @cube-creator/core-api              | Patch |


codecov-commenter commented 1 year ago

Codecov Report

Merging #1357 (8104406) into master (4325eba) will decrease coverage by 61.72%. The diff coverage is 5.52%.

@@             Coverage Diff             @@
##           master    #1357       +/-   ##
===========================================
- Coverage   81.04%   19.32%   -61.73%     
===========================================
  Files         195      197        +2     
  Lines       13506    13379      -127     
  Branches      754      105      -649     
===========================================
- Hits        10946     2585     -8361     
- Misses       2552    10794     +8242     
+ Partials        8        0        -8     

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| ...s/shared-dimensions/lib/domain/shared-dimension.ts | 0.00% <0.00%> (-92.15%) | :arrow_down: |
| ...d-dimensions/lib/domain/shared-dimension/import.ts | 0.00% <0.00%> (-97.98%) | :arrow_down: |
| ...-dimensions/lib/domain/shared-dimension/queries.ts | 0.00% <0.00%> (ø) | |
| apis/shared-dimensions/lib/loader.ts | 0.00% <0.00%> (-84.06%) | :arrow_down: |
| apis/shared-dimensions/lib/rewrite.ts | 0.00% <0.00%> (-86.37%) | :arrow_down: |
| apis/shared-dimensions/lib/store.ts | 0.00% <0.00%> (-93.59%) | :arrow_down: |
| apis/shared-dimensions/lib/store/index.ts | 0.00% <0.00%> (ø) | |
| apis/shared-dimensions/lib/store/shapes.ts | 0.00% <0.00%> (ø) | |
| packages/testing/lib/seedData.ts | 55.68% <39.13%> (-41.29%) | :arrow_down: |
| packages/core/namespace.ts | 100.00% <100.00%> (ø) | |

... and 144 more
