zazuko / query-rdf-data-cube

Explore or query RDF Data Cubes with a JavaScript API, without writing SPARQL.
https://zazuko.github.io/query-rdf-data-cube/

Ideas for new API #57

Open jstcki opened 4 years ago

jstcki commented 4 years ago

@vhf @BenjaminHofstetter @martinmaillard,

As discussed with @ktk, I'll share some ideas for what we as users would expect from the next version of this library or its successor 😄

Some context: for visualize.admin.ch we built a GraphQL API for the frontend, so that we could a) ship less code to the browser by using this library only on the server side and b) simplify the frontend code by also handling all data transformations in the "backend of the frontend".

I don't intend to replace GraphQL, because it's very convenient for us, but the schema we've created may serve as a first indication of how we query the data. You can play with it at https://visualize.admin.ch/api/graphql

As you'll see, we added a few things on top of this library, most notably:

  1. We lifted up all the extraMetadata into the DataCube type itself, so the data we're working with is less nested
  2. We have distinct types for the different kinds of Dimensions (temporal, nominal, ordinal)
  3. The most important thing: in the DataCube.observations query, the data is re-mapped to use the dimension IRIs as object keys again. This is important, as we need to reference these IRIs in the frontend again. I wrote about this in #51
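To make point 3 concrete, here is a minimal sketch of the re-mapping step. It assumes the rows come back keyed by short, user-chosen aliases; the type names, the `aliasToIri` table and the `rekeyByIri` helper are all hypothetical and not part of this library, and the pairing of each alias with a dimension IRI is a guess based on the sample values further down.

```typescript
// Hypothetical shapes: none of these names come from the library.
type AliasedRow = Record<string, string>;
type IriKeyedRow = Record<string, string>;

const DIMENSION_BASE =
  "http://environment.ld.admin.ch/foen/px/0703030000_122/dimension/";

// Assumed pairing of query aliases with dimension IRIs (illustrative only).
const aliasToIri: Record<string, string> = {
  year: DIMENSION_BASE + "0",
  zone: DIMENSION_BASE + "1",
  canton: DIMENSION_BASE + "2",
};

// Replace every alias key with the IRI of the dimension it stands for.
function rekeyByIri(
  row: AliasedRow,
  mapping: Record<string, string>
): IriKeyedRow {
  const out: IriKeyedRow = {};
  for (const [alias, value] of Object.entries(row)) {
    const iri = mapping[alias];
    if (iri === undefined) {
      // Fail loudly rather than silently dropping a column.
      throw new Error(`No dimension IRI known for alias "${alias}"`);
    }
    out[iri] = value;
  }
  return out;
}

const rekeyed = rekeyByIri(
  { zone: "Mittelland", canton: "Schweiz", year: "2008" },
  aliasToIri
);
```

Having the IRI as the key means the frontend never needs to know which aliases the server picked for a given query.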

There's more, and I can elaborate later, but this should be a good overview of where we currently are.

/cc @lucguillemot

jstcki commented 4 years ago

Here's an example (GraphQL) query for how we are using the observations. Note that we're using the dimension IRIs as keys on the observations. This allows us to map them back to labels, colors, etc. in our app.

curl 'https://visualize.admin.ch/api/graphql' \
  -H 'Accept-Encoding: gzip, deflate, br' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'Connection: keep-alive' \
  -H 'DNT: 1' \
  -H 'Origin: https://visualize.admin.ch' \
  --data-binary '{"query":"{\n  dataCubeByIri(\n    iri: \"http://environment.ld.admin.ch/foen/px/0703030000_122/dataset\"\n  ) {\n    title\n    observations {\n      data\n    }\n  }\n}\n"}' \
  --compressed

This results in something like this:

{
  "data": {
    "dataCubeByIri": {
      "title": "Investitionen in den Forstbetrieben ab 50 ha Wald in Franken nach Jahr, Forstzone, Kanton und Variable",
      "observations": {
        "data": [
          {
            "http://environment.ld.admin.ch/foen/px/0703030000_122/dimension/1": "Mittelland",
            "http://environment.ld.admin.ch/foen/px/0703030000_122/dimension/2": "Schweiz",
            "http://environment.ld.admin.ch/foen/px/0703030000_122/dimension/0": "2008"
          },
          //...
        ]
      }
    }
  }
}
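As a rough illustration of why the IRI keys are useful on the client, the sketch below maps one observation from the response above back to display labels. The `dimensionLabels` table and the `withLabels` helper are hypothetical, and the labels themselves are only guessed from the cube title, not taken from the real metadata.

```typescript
type Observation = Record<string, string>;

const DIM = "http://environment.ld.admin.ch/foen/px/0703030000_122/dimension/";

// Assumed labels per dimension IRI (guessed from the cube title).
const dimensionLabels = new Map<string, string>([
  [DIM + "0", "Jahr"],
  [DIM + "1", "Forstzone"],
  [DIM + "2", "Kanton"],
]);

// Turn an IRI-keyed observation into [label, value] pairs for display.
// Unknown IRIs fall back to the IRI itself so no value is lost.
function withLabels(
  obs: Observation,
  labels: Map<string, string>
): Array<[string, string]> {
  return Object.entries(obs).map(
    ([iri, value]): [string, string] => [labels.get(iri) ?? iri, value]
  );
}

// The observation shown in the response above.
const observation: Observation = {
  [DIM + "1"]: "Mittelland",
  [DIM + "2"]: "Schweiz",
  [DIM + "0"]: "2008",
};

const labelled = withLabels(observation, dimensionLabels);
```

The same lookup pattern would work for colors or any other per-dimension setting the app keeps keyed by IRI.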

As just discussed, this is really the structure we prefer for observations (as "raw" and flat as possible).

We query cube metadata upfront and separately from observations, so there is really no need to map observations to the metadata structure.
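A small sketch of that split, under assumed shapes: the metadata is fetched once and indexed by IRI, while observations stay flat and are only joined with it when needed. The `Dimension` union mirrors the temporal/nominal/ordinal distinction mentioned earlier, but its fields, the `indexByIri` and `pickByKind` helpers, and the kind assigned to each dimension are hypothetical.

```typescript
type Observation = Record<string, string>;

// Hypothetical metadata shape with one variant per kind of dimension.
type Dimension =
  | { kind: "temporal"; iri: string; label: string }
  | { kind: "nominal"; iri: string; label: string }
  | { kind: "ordinal"; iri: string; label: string; order: string[] };

const BASE = "http://environment.ld.admin.ch/foen/px/0703030000_122/dimension/";

// Metadata as it might look after a separate, upfront query (assumed values).
const dimensions: Dimension[] = [
  { kind: "temporal", iri: BASE + "0", label: "Jahr" },
  { kind: "nominal", iri: BASE + "1", label: "Forstzone" },
  { kind: "nominal", iri: BASE + "2", label: "Kanton" },
];

// Build the lookup once; observations are never rewritten to match it.
function indexByIri(dims: Dimension[]): Map<string, Dimension> {
  return new Map(dims.map((d): [string, Dimension] => [d.iri, d]));
}

// Keep only the values whose dimension is of the requested kind.
function pickByKind(
  obs: Observation,
  index: Map<string, Dimension>,
  kind: Dimension["kind"]
): Observation {
  const out: Observation = {};
  for (const [iri, value] of Object.entries(obs)) {
    if (index.get(iri)?.kind === kind) out[iri] = value;
  }
  return out;
}

const index = indexByIri(dimensions);
const flat: Observation = {
  [BASE + "1"]: "Mittelland",
  [BASE + "2"]: "Schweiz",
  [BASE + "0"]: "2008",
};
const temporalOnly = pickByKind(flat, index, "temporal");
```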