Swirrl / cubiql

CubiQL: A GraphQL service for querying multidimensional Linked Data Cubes
Eclipse Public License 1.0

MI wave buoy summary datacube - Cubiql exception #114

Open robthomas-marine opened 6 years ago

robthomas-marine commented 6 years ago

Hi,

At the Marine Institute we have generated daily summary data for the wave buoy network and converted the data to a DataCube available from an (at present internal) Virtuoso SPARQL endpoint. We hope to have an externally accessible endpoint in place later this week.

I have followed the guidelines to create a variant of a DataCube which meets the criteria for Cubiql (no multi-measures, rdfs:label not skos:prefLabel etc) but when I try to run Cubiql (0.2.0) against the endpoint I get the following error message (reformatted to make more readable):

Exception in thread "main" clojure.lang.ExceptionInfo: Argument `measure' of field `dataset_irish_wave_buoy_network_daily_summary_statistics_observations_aggregations/max' references unknown type `dataset_irish_wave_buoy_network_daily_summary_statistics_aggregation_measures'.
{:field-name :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_aggregations/max,
 :arg-name :measure,
 :schema-types {:scalar [:Boolean :DateTime :Float :ID :Int :SparqlCursor :String :uri],
                :object [:MutationRoot :QueryRoot :SubscriptionRoot :dataset :dataset_irish_wave_buoy_network_daily_summary_statistics :dataset_irish_wave_buoy_network_daily_summary_statistics_observations :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_aggregations :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_page :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_page_observations :dim :enum_dim_value :measure :ref_area :ref_period :unmapped_dim_value],
                :union [:dim_value],
                :input-object [:dataset_irish_wave_buoy_network_daily_summary_statistics_observations_dimensions :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_order_spec :filter :page_selector :ref_period_filter],
                :interface [:dataset_meta :resource],
                :enum [:sort_direction]}}

My interpretation is that something in the DataCube is not what CubiQL is expecting. Could someone please translate and explain what is causing the error? Then I'll correct the local DataCube.

Cheers Rob

zeginis commented 6 years ago

@robthomas-marine it seems to be an error with the cube measures. CubiQL requires the use of qb:measureType even if there is only one measure. Is your data compatible with this?
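
If it helps to verify this against your endpoint, an ASK query along the following lines should flag observations that are missing qb:measureType. This is only an illustrative sketch (not a query CubiQL itself runs); qb is the standard Data Cube namespace.

PREFIX qb: <http://purl.org/linked-data/cube#>

# Returns true if at least one observation has no qb:measureType
ASK {
  ?obs a qb:Observation .
  FILTER NOT EXISTS { ?obs qb:measureType ?measure }
}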

Can you share your data so I can have a look and see what the problem is?

robthomas-marine commented 6 years ago

@zeginis example of an observation in turtle:

eg:IWaveBN_Daily_o349489 a qb:Observation ;
    qb:dataSet eg:IWaveBN_Daily ;
    eg:station_id mi-vcb:westwave_MK4 ;
    eg:statistic mi-vcb:daily_max ;
    eg:Date '2017-11-21'^^xsd:date ;
    eg:Hmax 0.0 ;
    qb:measureType eg:Hmax .

I have uploaded the full set of turtle files to the folder set up by NUIG.

zeginis commented 6 years ago

@robthomas-marine I have checked the data you uploaded. There are some issues that need to be fixed to make the data compatible with CubiQL:

We are currently working on some ASK queries to test the compatibility of a data cube with CubiQL (#127).
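
As a rough sketch of the kind of check we have in mind (not the final queries from #127), an ASK like the following tests whether a dataset's structure declares qb:measureType as a dimension:

PREFIX qb: <http://purl.org/linked-data/cube#>

# Returns true if some dataset's DSD declares qb:measureType as a dimension
ASK {
  ?ds a qb:DataSet ;
      qb:structure ?dsd .
  ?dsd qb:component ?comp .
  ?comp qb:dimension qb:measureType .
}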

robthomas-marine commented 6 years ago

@zeginis I have updated the RDF following the four points above. I'll upload the files to the shared folder.

Our endpoint is still internal, but we hope to have something that can be queried externally next week.

When I run CubiQL against the internal endpoint graph I'm still getting the following error:

C:\Users\rthomas\Downloads>java -jar graphql-qb-0.2.0-standalone.jar --port 9000 --endpoint http://virtuoso.dm.marine.ie/sparql?default-graph-uri=wave-cubiql

Exception in thread "main" clojure.lang.ExceptionInfo: Argument `measure' of field `dataset_irish_wave_buoy_network_daily_summary_statistics_observations_aggregations/max' references unknown type `dataset_irish_wave_buoy_network_daily_summary_statistics_aggregation_measures'.
{:field-name :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_aggregations/max,
 :arg-name :measure,
 :schema-types {:scalar [:Boolean :DateTime :Float :ID :Int :SparqlCursor :String :uri],
                :object [:MutationRoot :QueryRoot :SubscriptionRoot :dataset :dataset_irish_wave_buoy_network_daily_summary_statistics :dataset_irish_wave_buoy_network_daily_summary_statistics_observations :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_aggregations :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_page :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_page_observations :dim :enum_dim_value :measure :ref_area :ref_period :unmapped_dim_value],
                :union [:dim_value],
                :input-object [:dataset_irish_wave_buoy_network_daily_summary_statistics_observations_dimensions :dataset_irish_wave_buoy_network_daily_summary_statistics_observations_order_spec :filter :page_selector :ref_period_filter],
                :interface [:dataset_meta :resource],
                :enum [:sort_direction]}}
    at clojure.core$ex_info.invokeStatic(core.clj:4739)
    at clojure.core$ex_info.invoke(core.clj:4739)
    at com.walmartlabs.lacinia.schema$verify_fields_and_args.invokeStatic(schema.clj:915)
    at com.walmartlabs.lacinia.schema$verify_fields_and_args.invoke(schema.clj:896)
    at com.walmartlabs.lacinia.schema$prepare_and_validate_object.invokeStatic(schema.clj:951)
    at com.walmartlabs.lacinia.schema$prepare_and_validate_object.invoke(schema.clj:949)
    at com.walmartlabs.lacinia.schema$prepare_and_validate_objects$fn__3206.invoke(schema.clj:1013)
    at com.walmartlabs.lacinia.schema$map_types$fn__2697.invoke(schema.clj:70)
    at clojure.lang.PersistentHashMap$NodeSeq.kvreduce(PersistentHashMap.java:1303)
    at clojure.lang.PersistentHashMap$BitmapIndexedNode.kvreduce(PersistentHashMap.java:796)
    at clojure.lang.PersistentHashMap$NodeSeq.kvreduce(PersistentHashMap.java:1308)
    at clojure.lang.PersistentHashMap$BitmapIndexedNode.kvreduce(PersistentHashMap.java:796)
    at clojure.lang.PersistentHashMap$ArrayNode.kvreduce(PersistentHashMap.java:464)
    at clojure.lang.PersistentHashMap.kvreduce(PersistentHashMap.java:236)
    at clojure.core$fn__8080.invokeStatic(core.clj:6765)
    at clojure.core$fn__8080.invoke(core.clj:6750)
    at clojure.core.protocols$fn__7860$G__7855__7869.invoke(protocols.clj:175)
    at clojure.core$reduce_kv.invokeStatic(core.clj:6776)
    at clojure.core$reduce_kv.invoke(core.clj:6767)
    at com.walmartlabs.lacinia.schema$map_types.invokeStatic(schema.clj:68)
    at com.walmartlabs.lacinia.schema$map_types.invoke(schema.clj:64)
    at com.walmartlabs.lacinia.schema$prepare_and_validate_objects.invokeStatic(schema.clj:1012)
    at com.walmartlabs.lacinia.schema$prepare_and_validate_objects.invoke(schema.clj:1008)
    at com.walmartlabs.lacinia.schema$construct_compiled_schema.invokeStatic(schema.clj:1055)
    at com.walmartlabs.lacinia.schema$construct_compiled_schema.invoke(schema.clj:1021)
    at com.walmartlabs.lacinia.schema$compile.invokeStatic(schema.clj:1105)
    at com.walmartlabs.lacinia.schema$compile.invoke(schema.clj:1081)
    at clojure.lang.AFn.applyToHelper(AFn.java:156)
    at clojure.lang.AFn.applyTo(AFn.java:144)
    at clojure.spec.test.alpha$spec_checking_fn$fn__2943.doInvoke(alpha.clj:141)
    at clojure.lang.RestFn.invoke(RestFn.java:421)
    at com.walmartlabs.lacinia.schema$compile.invokeStatic(schema.clj:1099)
    at com.walmartlabs.lacinia.schema$compile.invoke(schema.clj:1081)
    at clojure.lang.AFn.applyToHelper(AFn.java:154)
    at clojure.lang.AFn.applyTo(AFn.java:144)
    at clojure.spec.test.alpha$spec_checking_fn$fn__2943.doInvoke(alpha.clj:141)
    at clojure.lang.RestFn.invoke(RestFn.java:408)
    at graphql_qb.core$build_schema_context.invokeStatic(core.clj:152)
    at graphql_qb.core$build_schema_context.invoke(core.clj:147)
    at graphql_qb.server$create_server.invokeStatic(server.clj:16)
    at graphql_qb.server$create_server.invoke(server.clj:13)
    at graphql_qb.server$start_server.invokeStatic(server.clj:27)
    at graphql_qb.server$start_server.invoke(server.clj:26)
    at graphql_qb.main$_main.invokeStatic(main.clj:45)
    at graphql_qb.main$_main.doInvoke(main.clj:36)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at graphql_qb.main.main(Unknown Source)

arekstasiewicz commented 6 years ago

We were able to run CubiQL on top of the data, but we are getting the following error after the initial run:

Error: dataset_irish_wave_buoy_network_daily_summary_statistics_observations_dimensions fields must be an object with field names as keys or a function which returns such an object.
    at invariant (http://localhost:9000/graphiql.js:24809:11)
    at GraphQLInputObjectType._defineFieldMap (http://localhost:9000/graphiql.js:28409:29)
    at GraphQLInputObjectType.getFields (http://localhost:9000/graphiql.js:28400:49)
    at typeMapReducer (http://localhost:9000/graphiql.js:29767:26)
    at Array.reduce (<anonymous>)
    at http://localhost:9000/graphiql.js:29760:36
    at Array.forEach (<anonymous>)
    at typeMapReducer (http://localhost:9000/graphiql.js:29753:27)
    at http://localhost:9000/graphiql.js:29762:20
    at Array.forEach (<anonymous>)

(screenshot, 2018-08-24 15:59)

Would you be able to point out which elements of the dimension definition are not compatible with CubiQL? (The code list? The component specification? The dimension property?)

Results:

(screenshot, 2018-08-24 16:04)

arekstasiewicz commented 6 years ago

Update: it looks like the usage of ComponentSpecification / ComponentProperty needs to be documented in CubiQL. We managed to adjust it in the test data.

zeginis commented 6 years ago

@arekstasiewicz what is the status of this issue? Does MI data work properly with CubiQL?

robthomas-marine commented 6 years ago

@zeginis On Friday I was able to get the MI SPARQL endpoint data working with: java -jar graphql-qb-0.2.0-standalone.jar --port 9000 --endpoint https://linked.marine.ie/sparql

A basic dataset query (query={datasets{uri%20title%20description%20}%20}) is working. Where can I get some additional queries to further test CubiQL functionality?

zeginis commented 6 years ago

@robthomas-marine in the README file you can find many example queries. Open the links and copy the queries to try them locally.

How did you manage to run CubiQL against your endpoint? What changes or configuration did you use? It would be of great help to have your feedback on the issues you encountered and how you fixed them.

Note that there is a minor change in the CubiQL schema in the latest versions: it uses ... page {observation ... instead of ... page {observations ...

e.g.

{
  cubiql {
    dataset_earnings {
      title
      description
      observation {
        page {
          observation {
            gender
            median
            uri
          }
        }
      }
    }
  }
}

robthomas-marine commented 6 years ago

@zeginis Thanks. I didn't change anything in the CubiQL configuration. Through trial and error in generating the DataCube, and in order to get the Data Cube integrity checks from the standard to pass, the following resulted in CubiQL running:

qb:measure and qb:dimension had to be replaced with qb:componentProperty in the Data Structure Definition:

eg:dsd-IWaveBN_Daily a qb:DataStructureDefinition;

# The dimensions

qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:station_id ; qb:order 1 ] ,
             [ a qb:ComponentSpecification ; qb:componentProperty eg:statistic ; qb:order 2 ] ,
             [ a qb:ComponentSpecification ; qb:componentProperty eg:Date ; qb:order 3 ] ,
             [ a qb:ComponentSpecification ; qb:componentProperty qb:measureType ; qb:order 4 ] ;

# The measure(s)
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:Hmax] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:MeanCurDirTo] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:MeanCurSpeed] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:PeakDirection] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:PeakPeriod] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:SeaTemperature] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:SignificantWaveHeight] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:THmax] ;
qb:component [ a qb:ComponentSpecification ; qb:componentProperty eg:UpcrossPeriod] ;

Each component property is then typed as a qb:MeasureProperty or qb:DimensionProperty as appropriate:

eg:station_id a rdf:Property , qb:ComponentProperty , qb:DimensionProperty , qb:CodedProperty ;
    rdfs:label "station_id" ;
    rdfs:subPropertyOf sdmx-dimension:refArea ;
    rdfs:range mi-vcb:station_id ;
    qb:concept sdmx-concept:refArea ;
    qb:codeList mi-vcb:station_id .

eg:statistic a rdf:Property , qb:ComponentProperty , qb:DimensionProperty , qb:CodedProperty ;
    rdfs:label "statistic" ;
    rdfs:subPropertyOf sdmx-dimension:statConcDef ;
    rdfs:range mi-vcb:statistic ;
    qb:concept sdmx-concept:statConcDef ;
    qb:codeList mi-vcb:statistic .

eg:Date a rdf:Property , qb:ComponentProperty , qb:DimensionProperty , qb:CodedProperty ;
    rdfs:label "Date" ;
    rdfs:subPropertyOf sdmx-dimension:refPeriod ;
    qb:concept sdmx-concept:refPeriod ;
    rdfs:range mi-vcb:dates ;
    qb:codeList mi-vcb:date .

qb:measureType a rdf:Property , qb:ComponentProperty , qb:DimensionProperty , qb:CodedProperty ;
    rdfs:label "measurementType" ;
    rdfs:range mi-vcb:measures ;
    qb:codeList mi-vcb:measures .

eg:Hmax a rdf:Property , qb:ComponentProperty , qb:MeasureProperty , skos:Concept ;
    rdfs:label "Hmax" ;
    rdfs:seeAlso <http://vocab.nerc.ac.uk/collection/P07/current/JNQS0CMX/> ;
    rdfs:subPropertyOf sdmx-measure:obsValue ;
    sdmx-attribute:unitMeasure <http://vocab.nerc.ac.uk/collection/P06/current/ULCM/> ;
    skos:inScheme mi-vcb:measures ;
    rdfs:range xsd:decimal .

However, querying for the measures and dimensions returns no results or an internal server error, respectively.

I'm trying the following but with little success: http://localhost:9000/graphql?query={%20datasets%20{%20uri%20title%20description%20measures{uri}}%20}

returns an empty list of measures:

{"data":{"datasets":[{"uri":"http://data.marine.ie/datacube#IWaveBN_Daily","title":"Irish Wave Buoy Network - Daily Summary Statistics","description":"Summary statistics (daily mean, standard deviation, minimun and maximum) by day of year for the Irish Wave Buoy network measurements.","measures":[]}]}}

http://localhost:9000/graphql?query={%20datasets%20{%20uri%20title%20description%20dimensions{uri}}%20}

returns an error message:

Internal server error: exception

zeginis commented 6 years ago

@robthomas-marine CubiQL requires qb:dimension and qb:measure for the dimensions and measures respectively. Using qb:componentProperty alone "hides" them from CubiQL.
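
To illustrate the difference, a lookup of the following shape only finds components that are attached with qb:dimension / qb:measure. This is a sketch of the idea, not CubiQL's actual internal query:

PREFIX qb: <http://purl.org/linked-data/cube#>

# For components attached only via qb:componentProperty, ?dim and ?meas come back unbound
SELECT ?dsd ?dim ?meas WHERE {
  ?ds a qb:DataSet ;
      qb:structure ?dsd .
  ?dsd qb:component ?comp .
  OPTIONAL { ?comp qb:dimension ?dim }
  OPTIONAL { ?comp qb:measure ?meas }
}

Since qb:dimension and qb:measure are subproperties of qb:componentProperty in the Data Cube vocabulary, you can state them alongside the existing qb:componentProperty triples.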

robthomas-marine commented 6 years ago

@zeginis I tried different combinations yesterday, and when adding qb:dimension and qb:measure in place of qb:componentProperty, CubiQL failed to run.

I'm meeting NUIG this afternoon and we'll investigate further.

zeginis commented 6 years ago

@robthomas-marine most probably CubiQL fails when using qb:dimension and qb:measure because there is some incompatibility in the dimensions and/or measures. By using qb:componentProperty we are just "hiding" the problem.