zazuko / cube-creator

A tool to create RDF cubes from CSV files
GNU Affero General Public License v3.0

mixed node kind in cube data #1469

Open giacomociti opened 8 months ago

giacomociti commented 8 months ago

Describe the bug

Affected functionalities (all that apply)

Relevant links

[query](https://s.zazuko.com/3pGR6yA) returning an example of inconsistent data (a dimension with constraint `sh:nodeKind sh:IRI` but having both IRI and literal values).

**To Reproduce**

Steps to reproduce the behavior:

1. Create a new cube from CSV
2. Apply transformation
3. Edit metadata, linking some (but not all) values of a dimension to a shared dimension
4. Publish

**Expected behavior**

The constraint for the dimension should have sh:nodeKind sh:IRIOrLiteral instead of sh:IRI.

Additional context

There is a proposal to disallow mixed node kinds. If it is adopted, Cube Creator should prevent publishing invalid data.

tpluscode commented 2 months ago

Is the correct node kind set when you transform again after editing the metadata?

giacomociti commented 2 months ago

By "editing the metadata", do you mean, for example, linking every value to some shared dimension term? The node kind is set by the pipeline (<#toCubeShape>) based on the actual observations, and, unlike the new implementation in barnard59, it does not cover sh:IRIOrLiteral. So the only way to avoid the issue is to ensure that all the values of a dimension have the same node kind (either IRI or Literal).
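To illustrate what deriving the node kind from the actual observations could look like when mixed values are covered, here is a minimal sketch. It is not the code of <#toCubeShape> or of barnard59; the `Term` shape and the `nodeKindFor` name are assumptions made for this example, and only the SHACL node kind IRIs are taken from the SHACL vocabulary.

```typescript
// Hypothetical, simplified term model (a real pipeline would use RDF/JS terms).
type TermType = 'NamedNode' | 'Literal' | 'BlankNode'
interface Term {
  termType: TermType
  value: string
}

const SH = 'http://www.w3.org/ns/shacl#'

// Maps the set of observed term types to the SHACL node kind that covers them.
const NODE_KINDS: Record<string, string> = {
  'BlankNode': 'BlankNode',
  'NamedNode': 'IRI',
  'Literal': 'Literal',
  'BlankNode+NamedNode': 'BlankNodeOrIRI',
  'BlankNode+Literal': 'BlankNodeOrLiteral',
  'NamedNode+Literal': 'IRIOrLiteral',
}

// Returns the sh:nodeKind IRI for a dimension's observed values, or undefined
// when there are no values or when all three kinds occur (SHACL defines no
// single node kind for that case, so the constraint would have to be omitted).
function nodeKindFor(values: Term[]): string | undefined {
  const seen = new Set(values.map((v) => v.termType))
  const key = (['BlankNode', 'NamedNode', 'Literal'] as const)
    .filter((t) => seen.has(t))
    .join('+')
  const name = NODE_KINDS[key]
  return name ? SH + name : undefined
}
```

With a dimension where some values are linked to shared dimension terms and others are still literals, `nodeKindFor` returns `sh:IRIOrLiteral` rather than `sh:IRI`, which is the expected behavior described above.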

Rdataflow commented 2 months ago

@giacomociti WRT the spec at https://cube.link/#null-empty-values, the most obvious option would be to reuse cube:Undefined in this case.

... so nodeKind would be consistently IRI :+1:
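One possible reading of this suggestion, as a rough sketch: every value that has no link to a shared dimension term is replaced by cube:Undefined, so that only IRIs remain. The `ObservedValue` shape, the `links` map and the `linkOrUndefined` name are hypothetical and not part of cube-creator. Note that this drops the original literal, which may or may not be acceptable for a given cube.

```typescript
// Hypothetical, simplified value model for this example only.
interface ObservedValue {
  termType: 'NamedNode' | 'Literal'
  value: string
}

// cube:Undefined, with cube: expanding to https://cube.link/
const CUBE_UNDEFINED: ObservedValue = {
  termType: 'NamedNode',
  value: 'https://cube.link/Undefined',
}

// `links` is an assumed lookup from a raw literal value to the IRI of the
// shared dimension term it was linked to while editing the metadata.
function linkOrUndefined(raw: ObservedValue, links: Map<string, string>): ObservedValue {
  // Values that are already IRIs are kept as they are.
  if (raw.termType === 'NamedNode') return raw
  const iri = links.get(raw.value)
  // Linked literals become the shared term; unlinked ones become cube:Undefined.
  return iri ? { termType: 'NamedNode', value: iri } : CUBE_UNDEFINED
}
```

After such a mapping, every value of the dimension is a `NamedNode`, so a constraint of `sh:nodeKind sh:IRI` would match the data.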

giacomociti commented 2 months ago

agreed, we should be using cube:Undefined.

There is an external application which may be affected by the change, so we'll probably evolve both the cube and the app in the future.

Rdataflow commented 2 months ago

@giacomociti to increase the reusability of cube:Undefined it might be good to establish labels using schema:name and maybe some other useful properties.

first brainstorming:

cube:Undefined  schema:name  "Undefined" , "Undefined"@en , "Unbestimmt"@de , "Indéfini"@fr , "Indefinito"@it , "Indefini"@rm .

possibly other attributes, e.g. the following. WDYT?

cube:Undefined  schema:identifier  ""^^cube:Undefined ;
    schema:position  ""^^cube:Undefined .