frictionlessdata / datapackage

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.
https://datapackage.org
The Unlicense

Implementation of JSON Table Schema at Open Power System Data #310

Closed jgmill closed 7 years ago

jgmill commented 8 years ago

Hi everyone, @frictionlessdata/specs-working-group

@danfowler suggested that I present here how we are planning to implement the JSON Table Schema at Open Power System Data.

Below is an example from the Time series Data Package in YAML format. Any criticism or suggestions for improvement are welcome:

GIST version

https://gist.github.com/rgrp/3b6f449a45e91932f3120d5603b33fae

Inlined here for easy reference:

name: opsd-time-series

title: Time series

description: Load, wind and solar, prices in hourly resolution

long_description: This data package contains different kinds of time series data relevant for power system modelling, namely electricity consumption (load) for 36 European countries as well as...

homepage: http://data.open-power-system-data.org/time_series/2016-07-14/

documentation: https://github.com/Open-Power-System-Data/time_series/blob/2016-07-14/main.ipynb

version: '2016-07-14'
last_changes: Included data from Energinet.DK, Elia and Svenska Kraftnaet

keywords:
    - Open Power System Data
    - time series
    - power systems
    - in-feed
    - renewables
    - wind
    - solar
    - power consumption
    - power market

geographical_scope: 35 European countries

contributors:
    - name: Jonathan Muehlenpfordt
      email: muehlenpfordt@neon-energie.de
      web : http://neon-energie.de/en/team/

sources: 
    - name: ENTSO-E Data Platform
      web: https://www.entsoe.eu/data/data-portal/consumption/Pages/default.aspx
    - name: ...
      web: ...

schemas:
    15min:
        primaryKey : timestamp
        fields:
            - name: timestamp
              description: Start of timeperiod in UTC
              type: datetime
              format: fmt:%Y-%m-%dT%H%M%SZ
              opsd-contentfilter : true
            - name: load_AT_load
              type: number
              description: Consumption in Austria in MW 
              missingValue: ""
              source:
                  web: https://www.entsoe.eu/data/data-portal/consumption/Pages/default.aspx
                  name: ENTSO-E
              opsd_properties:
                  Attribute: load
                  Region: AT
                  Variable: load
              first: '2006-01-01'
              last: '2015-12-31'
            - name: ...
    60min:
        ...

resources: 
    - path: timeseries15.csv
      format: csv
      mediatype: text/csv
      encoding: UTF8
      dialect: 
          csvddfVersion: 1.0
          delimiter: ","
          lineTerminator: "\\n"
          header: true 
      schema: 15min 
      alternative_formats:
          - path: timeseries15.csv
            stacking: Singleindex
            format: csv
          - path: timeseries15.xlsx
            stacking: Singleindex
            format: xlsx
          - path: timeseries15_multiindex.xlsx
            stacking: Multiindex
            format: xlsx
          - path: timeseries15_multiindex.csv
            stacking: Multiindex
            format: csv
          - path: timeseries15_stacked.csv
            stacking: Stacked
            format: csv
    - path: timeseries60.csv
      format: csv
      mediatype: text/csv
      encoding: ...
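
The descriptor above keeps its schemas in a top-level `schemas` mapping and lets each resource point at one by name (`schema: 15min`). For tools that expect every resource to carry its own schema object, the references could be inlined before the JSON file is written. The following is a minimal sketch of that idea, assuming the YAML has already been loaded into a plain Python dict; the function name `inline_schemas` is hypothetical, and the descriptor is a cut-down copy of the example, not the full package.

```python
import copy
import json


def inline_schemas(descriptor):
    """Return a copy of the descriptor in which named schema
    references on resources are replaced by the schema objects."""
    result = copy.deepcopy(descriptor)
    # Remove the shared mapping; resources will carry their own schema.
    schemas = result.pop("schemas", {})
    for resource in result.get("resources", []):
        ref = resource.get("schema")
        if isinstance(ref, str):
            if ref not in schemas:
                raise KeyError(
                    f"resource {resource.get('path')!r} refers to "
                    f"unknown schema {ref!r}"
                )
            resource["schema"] = copy.deepcopy(schemas[ref])
    return result


# Cut-down descriptor for illustration only.
descriptor = {
    "name": "opsd-time-series",
    "schemas": {
        "15min": {
            "primaryKey": "timestamp",
            "fields": [
                {"name": "timestamp", "type": "datetime"},
                {"name": "load_AT_load", "type": "number"},
            ],
        }
    },
    "resources": [
        {"path": "timeseries15.csv", "format": "csv", "schema": "15min"}
    ],
}

resolved = inline_schemas(descriptor)
print(json.dumps(resolved, indent=2))
```

The original dict is left untouched, so the compact YAML form with named schemas can stay the source of truth while the inlined JSON is generated from it.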

rufuspollock commented 8 years ago

Great. I've inlined it into the comment for easy reference (when I clicked on the link you had, I ended up downloading the file ...)

Quick questions:

jgmill commented 8 years ago

You can now find an actual Data Package including the full JSON-file on our website.

The alternative formats are indeed different computed versions of the same file, available for download under "Alternative file formats" on that same website. The idea is to provide the different formats that our users have requested.
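
To illustrate how one of these alternative versions could be computed from the main file, here is a minimal sketch that turns a wide, single-index CSV into a stacked one. What "Stacked" means exactly is an assumption here (one row per timestamp and column); the function name `stack_rows`, the output column names `variable` and `value`, the column `load_DE_load`, and the sample numbers are all made up for illustration. Empty cells are skipped, matching `missingValue: ""` in the schema.

```python
import csv
import io


def stack_rows(wide_csv_text, key="timestamp"):
    """Convert wide CSV text into a list of stacked rows:
    one dict per (timestamp, column) pair with a non-empty value."""
    reader = csv.DictReader(io.StringIO(wide_csv_text))
    stacked = []
    for row in reader:
        for column, value in row.items():
            # Skip the key column itself and missing values ("").
            if column == key or value == "":
                continue
            stacked.append(
                {key: row[key], "variable": column, "value": value}
            )
    return stacked


# Invented sample data in the wide (single-index) layout.
sample = (
    "timestamp,load_AT_load,load_DE_load\n"
    "2006-01-01T000000Z,5000,\n"
    "2006-01-01T001500Z,5100,40000\n"
)

rows = stack_rows(sample)

# Write the stacked rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["timestamp", "variable", "value"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Values are kept as strings so that the stacked file reproduces the source data exactly; any type conversion is left to the consumer, guided by the schema.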

rufuspollock commented 7 years ago

@muehlenpfordt any thoughts re #292 and YAML vs JSON?

rufuspollock commented 7 years ago

FIXED / INVALID. Closing, as this was an FYI issue, not a bug or feature proposal.

@danfowler possibly better to have these FYI items on the forum in future :-)