Swirrl / cubiql

CubiQL: A GraphQL service for querying multidimensional Linked Data Cubes
Eclipse Public License 1.0

CSO data compatibility #107

Open arekstasiewicz opened 6 years ago

arekstasiewicz commented 6 years ago

I was trying to use CubiQL with CSO endpoint:

java -jar graphql-qb-0.2.1-SNAPSHOT-standalone.jar --port 9000 --endpoint http://data.cso.ie/sparql

The following error is displayed:

[Screenshot: screen shot 2018-07-19 at 15 07 46]

@zeginis Could you please advise how complicated it would be to make the data work with the API?

zeginis commented 6 years ago

@arekstasiewicz the requirements for data to work with CubiQL are documented at #92.

Most of these will be fixed. There are 2 open issues:

As a first step, you can attach the code lists that are defined at CSO to the dimensions using qb:codeList. I have checked them, and all the values they contain are used in every cube, so they comply with the above requirement (the only exception is the age dimension).

mohadelrezk commented 5 years ago

Hi @zeginis, I am attaching a sample cube, cso_noENtag.ttl, that I created to align with CubiQL, but it still throws an error at creation time. I have addressed the following data restriction issues:
1- qb:codeList
2- qb:measureType
3- Language tag
4- multiple publishers

what is it still missing? error: exception in thread "main" clojure.lang.ExceptionInfo: Call to #'com.walmartlabs.lacinia.schema/compile did not conform to spec: In: [0 :objects :dataset_cso 1 :description] val: #grafter.rdf.protocols.LangString{:string "CSO", :lang :en} fails spec: :com.walmartlabs.lacinia.schema/description at: [:args :schema :objects 1 :description] predicate: string? {:clojure.spec.alpha/problems ({:path [:args :schema :objects 1 :description], :pred clojure.core/string?, :val #grafter.rdf.protocols.LangString{:string "CSO", :lang :en}, :via [:com.walmartlabs.lacinia.schema/schema-object :com.walmartlabs.lacinia.schema/schema-object :com.walmartlabs.lacinia.schema/objects :com.walmartlabs.lacinia.schema/object :com.walmartlabs.lacinia.schema/description], :in [0 :objects :dataset_cso 1 :description]}), :clojure.spec.alpha/spec #object[clojure.spec.alpha$regex_spec_impl$reify__2436 0x35329a05 "clojure.spec.alpha$regex_spec_impl$reify__2436@35329a05"], :clojure.spec.alpha/value ({:objects {:ref_area {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the reference area"}, :label {:type String, :description "Label for the reference area"}}}, :dataset_cso_observations {:fields {:sparql {:type String, :description "SPARQL query used to retrieve matching observations.", :resolve #object[graphql_qb.resolvers$resolve_observations_sparql_query 0x6f1d799 "graphql_qb.resolvers$resolve_observations_sparql_query@6f1d799"]}, :page {:type :dataset_cso_observations_page, :args {:after {:type :SparqlCursor}, :first {:type Int}}, :description "Page of results to retrieve.", :resolve #object[graphql_qb.schema$wrap_observations_mapping$fn__5325 0xa120b9 "graphql_qb.schema$wrap_observations_mapping$fn__5325@a120b9"]}, :total_matches {:type Int}, :aggregations {:type :dataset_cso_observations_aggregations}}}, :ref_period {:fields {:uri {:type :uri, :description "URI of the reference period"}, :label {:type String, :description "Label for the 
reference period"}, :start {:type :DateTime, :description "Start time for the period"}, :end {:type :DateTime, :description "End time for the period"}}}, :dim {:fields {:uri {:type :uri, :description "URI of the dimension"}, :values {:type (list :dim_value), :description "Code list of values for the dimension"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso_observations_page {:fields {:next_page {:type :SparqlCursor, :description "Cursor to the next page of results"}, :count {:type Int}, :observations {:type (list :dataset_cso_observations_page_observations), :description "List of observations on this page"}}}, :unmapped_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}}}, :measure {:fields {:uri {:type :uri, :description "URI of the measure"}, :label {:type String, :description "Label for the measure"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :enum_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :observations {:type :dataset_cso_observations, :args {:dimensions {:type :dataset_cso_observations_dimensions}, :order {:type (list :dataset_cso_dimension_measures)}, :order_spec {:type :dataset_cso_observations_order_spec}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x21a9f95b 
"graphql_qb.schema$argument_mapping_resolver$fn__5311@21a9f95b"]}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.schema$get_query_schema_model$fn__5359 0x69069866 "graphql_qb.schema$get_query_schema_model$fn__5359@69069866"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}, :description #grafter.rdf.protocols.LangString{:string "CSO", :lang :en}}, :dataset_cso_observations_aggregations {:fields {:max {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0xac417a2 "graphql_qb.schema$argument_mapping_resolver$fn__5311@ac417a2"]}, :min {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x64c95480 "graphql_qb.schema$argument_mapping_resolver$fn__5311@64c95480"]}, :sum {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x69499c6f "graphql_qb.schema$argument_mapping_resolver$fn__5311@69499c6f"]}, :average {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x3451fc88 "graphql_qb.schema$argument_mapping_resolver$fn__5311@3451fc88"]}}}, :dataset {:implements [:dataset_meta], 
:fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213 0x3041beb3 "graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213@3041beb3"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :resolve #object[graphql_qb.resolvers$dataset_measures_resolver$fn__5205 0x2e40fdbd "graphql_qb.resolvers$dataset_measures_resolver$fn__5205@2e40fdbd"], :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}}, :dataset_cso_observations_page_observations {:fields {:uri {:type :uri}, :value {:type String}}}}, :interfaces {:dataset_meta {:description "Fields common to generic and specific dataset schemas", :fields {:uri {:type :uri, :description "Dataset URI"}, :title {:type String, :description "Dataset title"}, :description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :dimensions {:type (list :dim), :description "Dimensions within the dataset"}, :measures {:type (list :measure), :description "Measure types within the dataset"}}}, :resource {:description "Resource with a URI and optional label", :fields {:uri {:type :uri, :description "URI of the resource"}, :label {:type String, :description "Optional label"}}}}, :enums {:sort_direction {:description "Which direction to sort a dimension or measure in", 
:values [:ASC :DESC]}}, :unions {:dim_value {:members [:enum_dim_value :unmapped_dim_value]}}, :input-objects {:filter {:fields {:or {:type (list :uri), :description "List of URIs for which at least one must be contained within matching datasets."}, :and {:type (list :uri), :description "List of URIs which must all be contained within matching datasets."}}}, :ref_period_filter {:fields {:uri {:type :uri, :description "URI of the reference period"}, :starts_before {:type :DateTime, :description "Latest start time for the reference period"}, :starts_after {:type :DateTime, :description "Earliest start time for the reference period"}, :ends_before {:type :DateTime, :description "Latest end time for the reference period"}, :ends_after {:type :DateTime, :description "Earliest end time for the reference period"}}}, :page_selector {:fields {:first {:type Int, :description "Number of results to retrive."}, :after {:type :SparqlCursor, :description "Cursor to the start of the results page"}}}, :dataset_cso_observations_dimensions {:fields {}}, :dataset_cso_observations_order_spec {:fields {:value {:type :sort_direction}}}}, :queries {:datasets {:type (list :dataset), :resolve #object[graphql_qb.resolvers$resolve_datasets 0x19647566 "graphql_qb.resolvers$resolve_datasets@19647566"], :args {:dimensions {:type :filter}, :uri {:type :uri}}}, :dataset_cso {:type :dataset_cso, :resolve #object[graphql_qb.resolvers$wrap_post_resolver$fn__5140 0x527d48db "graphql_qb.resolvers$wrap_post_resolver$fn__5140@527d48db"]}}, :scalars {:SparqlCursor {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2335aef2 "clojure.spec.alpha$spec_impl$reify__1987@2335aef2"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x17003497 "clojure.spec.alpha$spec_impl$reify__1987@17003497"]}, :uri {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f038d3c "clojure.spec.alpha$spec_impl$reify__1987@2f038d3c"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x376498da 
"clojure.spec.alpha$spec_impl$reify__1987@376498da"]}, :DateTime {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x39a8e2fa "clojure.spec.alpha$spec_impl$reify__1987@39a8e2fa"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f9addd4 "clojure.spec.alpha$spec_impl$reify__1987@2f9addd4"]}}}), :clojure.spec.alpha/args ({:objects {:ref_area {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the reference area"}, :label {:type String, :description "Label for the reference area"}}}, :dataset_cso_observations {:fields {:sparql {:type String, :description "SPARQL query used to retrieve matching observations.", :resolve #object[graphql_qb.resolvers$resolve_observations_sparql_query 0x6f1d799 "graphql_qb.resolvers$resolve_observations_sparql_query@6f1d799"]}, :page {:type :dataset_cso_observations_page, :args {:after {:type :SparqlCursor}, :first {:type Int}}, :description "Page of results to retrieve.", :resolve #object[graphql_qb.schema$wrap_observations_mapping$fn__5325 0xa120b9 "graphql_qb.schema$wrap_observations_mapping$fn__5325@a120b9"]}, :total_matches {:type Int}, :aggregations {:type :dataset_cso_observations_aggregations}}}, :ref_period {:fields {:uri {:type :uri, :description "URI of the reference period"}, :label {:type String, :description "Label for the reference period"}, :start {:type :DateTime, :description "Start time for the period"}, :end {:type :DateTime, :description "End time for the period"}}}, :dim {:fields {:uri {:type :uri, :description "URI of the dimension"}, :values {:type (list :dim_value), :description "Code list of values for the dimension"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso_observations_page {:fields {:next_page {:type :SparqlCursor, :description "Cursor to the next page of results"}, :count {:type Int}, :observations {:type (list :dataset_cso_observations_page_observations), :description "List of observations on this page"}}}, 
:unmapped_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}}}, :measure {:fields {:uri {:type :uri, :description "URI of the measure"}, :label {:type String, :description "Label for the measure"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :enum_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :observations {:type :dataset_cso_observations, :args {:dimensions {:type :dataset_cso_observations_dimensions}, :order {:type (list :dataset_cso_dimension_measures)}, :order_spec {:type :dataset_cso_observations_order_spec}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x21a9f95b "graphql_qb.schema$argument_mapping_resolver$fn__5311@21a9f95b"]}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.schema$get_query_schema_model$fn__5359 0x69069866 "graphql_qb.schema$get_query_schema_model$fn__5359@69069866"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}, 
:description #grafter.rdf.protocols.LangString{:string "CSO", :lang :en}}, :dataset_cso_observations_aggregations {:fields {:max {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0xac417a2 "graphql_qb.schema$argument_mapping_resolver$fn__5311@ac417a2"]}, :min {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x64c95480 "graphql_qb.schema$argument_mapping_resolver$fn__5311@64c95480"]}, :sum {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x69499c6f "graphql_qb.schema$argument_mapping_resolver$fn__5311@69499c6f"]}, :average {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x3451fc88 "graphql_qb.schema$argument_mapping_resolver$fn__5311@3451fc88"]}}}, :dataset {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213 0x3041beb3 "graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213@3041beb3"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, 
:measures {:type (list :measure), :resolve #object[graphql_qb.resolvers$dataset_measures_resolver$fn__5205 0x2e40fdbd "graphql_qb.resolvers$dataset_measures_resolver$fn__5205@2e40fdbd"], :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}}, :dataset_cso_observations_page_observations {:fields {:uri {:type :uri}, :value {:type String}}}}, :interfaces {:dataset_meta {:description "Fields common to generic and specific dataset schemas", :fields {:uri {:type :uri, :description "Dataset URI"}, :title {:type String, :description "Dataset title"}, :description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :dimensions {:type (list :dim), :description "Dimensions within the dataset"}, :measures {:type (list :measure), :description "Measure types within the dataset"}}}, :resource {:description "Resource with a URI and optional label", :fields {:uri {:type :uri, :description "URI of the resource"}, :label {:type String, :description "Optional label"}}}}, :enums {:sort_direction {:description "Which direction to sort a dimension or measure in", :values [:ASC :DESC]}}, :unions {:dim_value {:members [:enum_dim_value :unmapped_dim_value]}}, :input-objects {:filter {:fields {:or {:type (list :uri), :description "List of URIs for which at least one must be contained within matching datasets."}, :and {:type (list :uri), :description "List of URIs which must all be contained within matching datasets."}}}, :ref_period_filter {:fields {:uri {:type :uri, :description "URI of the reference period"}, :starts_before {:type :DateTime, :description "Latest start time for the reference period"}, :starts_after {:type :DateTime, :description "Earliest start time for the reference period"}, :ends_before {:type :DateTime, :description "Latest end time for the reference 
period"}, :ends_after {:type :DateTime, :description "Earliest end time for the reference period"}}}, :page_selector {:fields {:first {:type Int, :description "Number of results to retrive."}, :after {:type :SparqlCursor, :description "Cursor to the start of the results page"}}}, :dataset_cso_observations_dimensions {:fields {}}, :dataset_cso_observations_order_spec {:fields {:value {:type :sort_direction}}}}, :queries {:datasets {:type (list :dataset), :resolve #object[graphql_qb.resolvers$resolve_datasets 0x19647566 "graphql_qb.resolvers$resolve_datasets@19647566"], :args {:dimensions {:type :filter}, :uri {:type :uri}}}, :dataset_cso {:type :dataset_cso, :resolve #object[graphql_qb.resolvers$wrap_post_resolver$fn__5140 0x527d48db "graphql_qb.resolvers$wrap_post_resolver$fn__5140@527d48db"]}}, :scalars {:SparqlCursor {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2335aef2 "clojure.spec.alpha$spec_impl$reify__1987@2335aef2"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x17003497 "clojure.spec.alpha$spec_impl$reify__1987@17003497"]}, :uri {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f038d3c "clojure.spec.alpha$spec_impl$reify__1987@2f038d3c"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x376498da "clojure.spec.alpha$spec_impl$reify__1987@376498da"]}, :DateTime {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x39a8e2fa "clojure.spec.alpha$spec_impl$reify__1987@39a8e2fa"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f9addd4 "clojure.spec.alpha$spec_impl$reify__1987@2f9addd4"]}}}), :clojure.spec.alpha/failure :instrument, :clojure.spec.test.alpha/caller {:file "core.clj", :line 157, :var-scope graphql-qb.core/build-schema-context}}

zeginis commented 5 years ago

#grafter.rdf.protocols.LangString{:string "CSO", :lang :en}

@mohadelrezk this line says that the label "CSO" carries an en language tag. I checked the data, but the "CSO" label there does not have a language tag. Are you using only the file you sent as input, or something more?
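To make the failure concrete, here is a small Python analogue of what the error reports: the schema check accepts only plain strings for a description, so a language-tagged value is rejected until its tag is dropped. This is an illustration only, not CubiQL or Lacinia code; the `LangString` class, `is_valid_description`, and `plain_text` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LangString:
    """Hypothetical stand-in for a language-tagged RDF literal such as "CSO"@en."""
    string: str
    lang: str


def is_valid_description(value):
    # Analogue of the failing `string?` predicate: only plain strings pass.
    return isinstance(value, str)


def plain_text(value):
    # Drop the language tag, leaving plain strings untouched.
    return value.string if isinstance(value, LangString) else value


tagged = LangString("CSO", "en")
print(is_valid_description(tagged))              # the tagged value is rejected
print(is_valid_description(plain_text(tagged)))  # the unwrapped text is accepted
```

In practice the fix discussed in this thread is on the data side (no language tag on the label), so the unwrapping step above is only there to show where the mismatch sits.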

The qb:codeList of each cube dimension should be a skos:ConceptScheme that includes all the URIs that are used as values of that dimension. In the file you sent, I see you use: qb:codeList "<http://purl.org/linked-data/sdmx/2009/subject#>"
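The code list requirement can be checked before loading the data. The following is a minimal sketch, assuming the concept scheme members and the dimension values found in the observations have already been extracted into two collections of URIs (for example with a SPARQL query); the function name and the example.org URIs are hypothetical.

```python
def check_code_list(code_list, used_values):
    """Compare a dimension's code list with the values the observations use."""
    code_list = set(code_list)
    used_values = set(used_values)
    return {
        # Values used in observations but absent from the concept scheme.
        "missing_from_code_list": sorted(used_values - code_list),
        # Concept scheme members that no observation uses.
        "unused_in_cube": sorted(code_list - used_values),
    }


# Hypothetical URIs, for illustration only.
scheme = {"http://example.org/sex/male", "http://example.org/sex/female"}
used = {"http://example.org/sex/male", "http://example.org/sex/other"}

report = check_code_list(scheme, used)
print(report)
```

An empty `missing_from_code_list` means the scheme covers every value the dimension uses; `unused_in_cube` lists the scheme members that never appear in an observation.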

Additionally, it is preferable to use URIs instead of strings for the values of the dimensions, e.g. use http://reference.data.gov.uk/id/year/2016 instead of "2016"^^xsd:string

ogi:observations_009097c4-45ee-40d2-b405-b82de3963ab7 a qb:Observation ;
    qb:dataSet ogi:cso_ds ;
    qb:measureType ogi:Value ;
    ogi:CensusYear "2016"^^xsd:string ;
    ogi:Nationality "Not stated, including no nationality"^^xsd:string ;
    ogi:Sex "Male"^^xsd:string ;
    ogi:SingleYearofAge "63 years"^^xsd:string ;
    ogi:Statistic "Population Usually Resident and Present in the State 2011 to 2016 (Number)"^^xsd:string ;
    ogi:Value "268"^^xsd:string .
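One way to move from string literals to URIs for the year dimension is a small conversion step when generating the cube. This is a sketch under the assumption that the census year is always a four-digit string; the URI base is the one suggested in the comment above, while the function name is hypothetical.

```python
import re

# Base of the year URIs suggested in the comment above.
YEAR_URI_BASE = "http://reference.data.gov.uk/id/year/"


def year_literal_to_uri(literal):
    """Turn a four-digit year string such as "2016" into a year URI."""
    value = literal.strip()
    if not re.fullmatch(r"\d{4}", value):
        raise ValueError(f"not a four-digit year: {literal!r}")
    return YEAR_URI_BASE + value


uri = year_literal_to_uri("2016")
print(uri)
```

Other dimensions in the sample (nationality, sex, age) would need their own concept URIs; this sketch does not cover them.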