New dataset: S1 Median electricity tariff per canton

cmdoret commented 1 year ago

Proposal to include dataset: Median electricity tariff per canton

Dataset properties

URL: https://energy.ld.admin.ch/elcom/electricityprice-canton
format: RDF
size: 1MB
dimensions: kanton, jahr, product, category
units: CHF cents
lang: de

Additional notes

I named this dataset S1 to differentiate it, since it comes from the LINDAS SPARQL endpoint.

The dataset lists the detailed median energy price per canton for each year from 2009 to 2023. The whole dataset is rendered as a visualization here: https://www.prix-electricite.elcom.admin.ch/?priceComponent=total

Questions

The dataset provides the individual price components (aid fee, grid usage, ...) as well as the total. Do we care about the individual components?
It has 2 products (standard and cheapest). Do we want to keep them as separate columns (total_price_standard, total_price_cheapest) or as a single "energy_product" column?
The category codes are not useful on their own, so I guess we should use their descriptions, maybe with some post-processing. Here are some examples: H1, H2, C3
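
To make the second question concrete, here is a minimal sketch of the two layouts in plain Python. The prices are placeholders, not real ElCom values, and the `to_wide` helper is hypothetical. The category dimension is left out to keep it short.

```python
# Illustrative rows only: the prices below are placeholders, not real ElCom values.
long_rows = [
    {"canton": "Bern", "period": 2023, "energy_product": "standard", "total": 20.0},
    {"canton": "Bern", "period": 2023, "energy_product": "cheapest", "total": 19.0},
    {"canton": "Uri", "period": 2023, "energy_product": "standard", "total": 23.0},
    {"canton": "Uri", "period": 2023, "energy_product": "cheapest", "total": 22.0},
]


def to_wide(rows):
    """Pivot long rows into one row per (canton, period), one column per product."""
    wide = {}
    for row in rows:
        key = (row["canton"], row["period"])
        out = wide.setdefault(key, {"canton": row["canton"], "period": row["period"]})
        out[f"total_price_{row['energy_product']}"] = row["total"]
    return list(wide.values())


wide_rows = to_wide(long_rows)
```

The long layout needs a filter on `energy_product` in every query, while the wide layout doubles the number of price columns for each price component we decide to keep.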

cmdoret commented 1 year ago

With the following query (run in the interactive query editor against the endpoint https://culture.ld.admin.ch/query):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cube: <https://cube.link/>
PREFIX dim: <https://energy.ld.admin.ch/elcom/electricityprice/dimension/>
PREFIX schema: <http://schema.org/>

# Median electricity price per canton

SELECT ?canton ?period ?category ?product ?total
WHERE {
  <https://energy.ld.admin.ch/elcom/electricityprice-canton> cube:observationSet ?obsSet .
  ?obsSet cube:observation ?obs .
  ?obs dim:canton [ schema:name ?canton ] ;
       dim:period ?period ;
       dim:product [ schema:name ?product ] ;
       dim:category [ schema:description ?category ] ;
       dim:total ?total  .
  FILTER(LANG(?canton) = "de")
  FILTER(LANG(?product) = "de")
  FILTER(LANG(?category) = "de")
}
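
For running the query outside the editor, something like the sketch below might work. It assumes the endpoint from the editor link accepts standard SPARQL 1.1 Protocol POST requests and returns SPARQL JSON results. The helper names (`build_request`, `rows_from_results`, `fetch_rows`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: endpoint URL taken from the query-editor link; adjust if needed.
ENDPOINT = "https://culture.ld.admin.ch/query"


def build_request(query, endpoint=ENDPOINT):
    """Build a form-encoded SPARQL POST request that asks for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    variables = payload["head"]["vars"]
    return [
        {var: (binding[var]["value"] if var in binding else None) for var in variables}
        for binding in payload["results"]["bindings"]
    ]


def fetch_rows(query, endpoint=ENDPOINT):
    """Send the query and return the decoded rows (performs a network call)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return rows_from_results(json.load(response))
```

All values come back as strings, so `period` and `total` would still need to be converted to numbers afterwards.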

The dataset looks something like this:

nooralahzadeh commented 1 year ago

Can we split the "verbrauchskategorien" (consumption categories) column into separate columns? For example: 2-Zimmerwohnung mit ElektroherdH1: (1'600 kWh/Jahr: 2-Zimmerwohnung mit Elektroherd) --> H1, 1'600 kWh/Jahr, 2-Zimmerwohnung mit Elektroherd

cmdoret commented 1 year ago


@nooralahzadeh We could, but wouldn't this make the queries and results pretty complicated? We would either have to be extremely specific when writing question-query pairs (specifying the exact category in the question, which is not very realistic), or write underspecified questions and SELECT all matching verbrauchskategorien columns (there would be 15 of these columns). On the other hand, keeping the categories as values would allow selecting them by pattern, e.g. LIKE %2_zimmerwohnung%.

Column names could look like this: h1_1600_kwh_pro_jahr_2_zimmerwohnung_mit_elektroherd
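
A rough sketch of how such a column name could be derived from the three parts. The `to_column_name` helper is hypothetical, and the rules (dropping the apostrophe, turning "/" into "pro") are only inferred from the example name above.

```python
import re


def to_column_name(*parts):
    """Join the parts and turn them into a lowercase snake_case identifier."""
    text = " ".join(parts).lower()
    text = text.replace("'", "")  # drop thousands separators: 1'600 -> 1600
    text = text.replace("/", " pro ")  # kwh/jahr -> kwh pro jahr
    text = re.sub(r"[^a-z0-9]+", "_", text)  # any other run of characters -> "_"
    return text.strip("_")


column = to_column_name("H1", "1'600 kWh/Jahr", "2-Zimmerwohnung mit Elektroherd")
```

Note that this simple version also replaces umlauts with underscores, so descriptions containing ä, ö or ü would need a transliteration step first.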

cmdoret commented 1 year ago

I realized that the schema proposed above does not work well for building queries. Each question would have to be about a specific product, as it would be hard to query specific aspects of one product. To make queries easier, I propose the following, so that we can ask questions based on:

category size (kwh)
category name
specifics in the description via pattern matching
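
The three fields could be extracted roughly as sketched below. This assumes the raw description looks like "H1: (1'600 kWh/Jahr: 2-Zimmerwohnung mit Elektroherd)", as in the example earlier in the thread; the real strings may differ. `parse_category` is a hypothetical helper.

```python
import re

# Assumed input shape: "<code>: (<size> kWh/Jahr: <free text>)"
CATEGORY_RE = re.compile(
    r"^(?P<name>[A-Z]\d+):\s*\((?P<desc>(?P<size>[\d']+)\s*kWh/Jahr.*)\)$"
)


def parse_category(raw):
    """Split a raw category string into name, yearly size in kWh, and description."""
    match = CATEGORY_RE.match(raw.strip())
    if match is None:
        raise ValueError(f"unexpected category format: {raw!r}")
    return {
        "category_name": match.group("name"),
        "category_size_kwh_per_year": int(match.group("size").replace("'", "")),
        "category_desc": match.group("desc"),
    }


parsed = parse_category("H1: (1'600 kWh/Jahr: 2-Zimmerwohnung mit Elektroherd)")
```

Keeping the full text in `category_desc` leaves the size visible in the description while still exposing it as a numeric column.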

# A tibble: 7,886 × 7
   canton         period category_name category_size_kwh_per_year category_desc product total
   <chr>           <dbl> <chr>                              <dbl> <chr>         <chr>   <dbl>
 1 Thurgau          2023 C5                                500000 500'000 kWh/… Günsti…  18.0
 2 Solothurn        2023 C5                                500000 500'000 kWh/… Günsti…  18.5
 3 Aargau           2023 C5                                500000 500'000 kWh/… Günsti…  17.3
 4 Tessin           2023 C5                                500000 500'000 kWh/… Günsti…  24.7
 5 Bern             2023 C5                                500000 500'000 kWh/… Günsti…  17.8
 6 Uri              2023 C5                                500000 500'000 kWh/… Günsti…  22.5
 7 Neuenburg        2023 C5                                500000 500'000 kWh/… Günsti…  19.7
 8 Jura             2023 C5                                500000 500'000 kWh/… Günsti…  17.8
 9 Schaffhausen     2023 C5                                500000 500'000 kWh/… Günsti…  18.0
10 Basel-Landsch…   2023 C5                                500000 500'000 kWh/… Günsti…  17.2
# ℹ 7,876 more rows
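
To check that this layout supports the three kinds of questions, here is a small sketch using an in-memory SQLite table. The table name `tariff`, the product label and all prices are placeholders, not real values, and the truncated C5 description is kept as it appears in the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tariff (
        canton TEXT,
        period INTEGER,
        category_name TEXT,
        category_size_kwh_per_year INTEGER,
        category_desc TEXT,
        product TEXT,
        total REAL
    )
    """
)

# Placeholder rows: prices and product labels are made up for illustration.
h1_desc = "1'600 kWh/Jahr: 2-Zimmerwohnung mit Elektroherd"
conn.executemany(
    "INSERT INTO tariff VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Bern", 2023, "H1", 1600, h1_desc, "standard", 25.0),
        ("Uri", 2023, "H1", 1600, h1_desc, "standard", 27.0),
        ("Bern", 2023, "C5", 500000, "500'000 kWh/…", "standard", 18.0),
    ],
)

# 1. Question based on category size.
by_size = conn.execute(
    "SELECT canton, total FROM tariff "
    "WHERE category_size_kwh_per_year < 5000 ORDER BY total"
).fetchall()

# 2. Question based on category name.
by_name = conn.execute(
    "SELECT canton FROM tariff WHERE category_name = 'C5'"
).fetchall()

# 3. Question based on a pattern in the description.
#    In LIKE, "_" matches any single character, so it also matches the "-".
by_pattern = conn.execute(
    "SELECT DISTINCT category_name FROM tariff "
    "WHERE category_desc LIKE '%2_zimmerwohnung%'"
).fetchall()
```

SQLite's LIKE is case-insensitive for ASCII characters by default, which is why the lowercase pattern matches here; other databases may behave differently.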

statistikZH / statbotData