zazuko / cube-creator

A tool to create RDF cubes from CSV files
GNU Affero General Public License v3.0

Dealing with masked values #1370

Open tboeni opened 1 year ago

tboeni commented 1 year ago

Recently I have been dealing with datasets that contain "masked" values in the measure dimension, meaning that if a value is smaller than a certain value "X", it is changed to "< X". When assigning a data type to the dimension, neither integer nor decimal is correct because of the masked values, but assigning string obviously does not work for representing the measure dimension in Visualize. Is there currently some kind of workaround for this kind of dataset?

Here is an example of such a "masked" cube: Arbeitsstaetten_masked

tpluscode commented 1 year ago

We currently have no way to handle such a case, nor can I think of how this kind of dimension should be handled.

For values of X, would you like to have them interpreted as numbers? If yes, what should the interpretation of < X be? Currently they are interpreted as invalid numbers, as expected.

Say we allow a dimension to have two types of values

<cube>
  cube:observationSet [
    cube:observation [ ex:value "X"^^xsd:int ] ;
    cube:observation [ ex:value "< X"^^xsd:string ] ;
  ] ;
.

Is that even allowed by the cube spec? @l00mi @ktk How would that be supported in Visualize? @ptbrowne

l00mi commented 1 year ago

For such cases the initial idea was to use annotations with empty values: https://cube.link/#null-empty-values. This is still somewhere on the roadmap for cube.link and also for the cube-creator (https://github.com/zazuko/cube-creator/wiki/Road-map#designer). @tboeni or @dabo505, this should be tracked here: https://gitlab.ldbar.ch/bafu/umweltdatenkiosk-planning/-/issues/332, potentially split out into its own requirement, "Annotations per Value".

Other use cases for "per-value" annotations in this issue: https://gitlab.ldbar.ch/bafu/umweltdatenkiosk-planning/-/issues/286
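
To make the "empty value plus annotation" idea a bit more concrete, here is a minimal Python sketch of how a CSV cell could be split before it is mapped to an observation. This is not how cube-creator works today; the names `Cell`, `parse_cell`, and `upper_limit`, as well as the assumed "< X" pattern, are hypothetical and only illustrate the approach: a masked cell yields an empty measure value, and the limit X is kept separately so it could later be attached as a per-value annotation.

```python
import re
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional

# Assumed shape of a masked cell: "<" followed by a plain number, e.g. "< 5" or "<0.5".
_MASKED = re.compile(r"^\s*<\s*(?P<limit>-?\d+(?:\.\d+)?)\s*$")


@dataclass(frozen=True)
class Cell:
    """Hypothetical intermediate form of one measure cell."""
    value: Optional[Decimal]        # numeric value, or None when the cell is masked
    upper_limit: Optional[Decimal]  # the X of "< X", to be carried as an annotation


def parse_cell(raw: str) -> Cell:
    """Split a raw CSV cell into a numeric value and an optional limit annotation."""
    match = _MASKED.match(raw)
    if match:
        # Masked: keep the measure empty and remember the limit separately.
        return Cell(value=None, upper_limit=Decimal(match.group("limit")))
    try:
        return Cell(value=Decimal(raw.strip()), upper_limit=None)
    except InvalidOperation:
        raise ValueError(f"not a number or masked value: {raw!r}")
```

With this split the measure dimension keeps a single numeric datatype, and a consumer such as Visualize could decide on its own how to render cells that carry only a limit.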