OpenEnergyPlatform / data-preprocessing

Repository for data formatting, import of data, data and metadata review, and data curation.
GNU Affero General Public License v3.0

Review: Datasets from EU-legislation #89

Open han-f opened 2 years ago

han-f commented 2 years ago

Issue description

I am submitting a set of metadata for review. The metadata is attached to a series of tables and already available on the OEP. These are projections from European Member States submitted to the European Commission according to EU-legislation.

Workflow checklist

  1. GitHub

    • [x] I have submitted this issue to have metadata and data review documented (Issue #90)
    • [x] Create a new review-branch and push OEMetadata to new branch (feature/eu_leg#89). If this step is too difficult, attach a file with the metadata as a comment in this issue and let the reviewer know.
  2. OEP

    • [x] Upload data to the OEP in schema model_draft (see upload tutorial)
    • [x] Link URL of data in this issue (model_draft.project_nameofdata)
  3. Start a Review

    • [x] Start a pull request (PR) from review-branch to master
    • [x] Assign a reviewer and get in contact
  4. Reviewer section

    • [x] A reviewer starts working on the issue
    • [x] Review data license
    • [x] A reviewer finished working on this issue (and awarded a badge)
    • [x] Update metadata on table
    • [x] Data moved to its final schema
    • [x] Add OEP tags to table
    • [x] Merge PR and delete review-branch
    • [x] Document final links of metadata and data in issue description
    • [ ] Close issue

Metadata and data for review

Here are the links to my data and metadata:

Reviewed and published metadata and data

Final naming and location of the data and metadata after the review are as follows: schema.tablename

eio_ir_article23_t1 (2014-2020)

eio_ir_article23_t3 (2015-2020)

eaa (2016-2020)

others

han-f commented 2 years ago

Data tables including metadata are here: https://openenergy-platform.org/dataedit/view/model_draft?query=EU-legislation cc @wingechr - as info that review process has started.

Ludee commented 2 years ago

We added the metadata for testing the review process and documenting changes.

steull commented 2 years ago

@han-f, according to the licence, it is originally version 2.5 and your version is version 4.0. Are you aware that you are relicensing?
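A check like the one described above could be sketched as follows. This is only a minimal illustration, assuming the metadata carries a top-level `licenses` list and a `licenses` list per entry in `sources`, each with a `name` key. The function name `license_mismatches` and the licence names in the example are hypothetical placeholders, not the actual licences of these tables.

```python
def license_mismatches(metadata):
    """List (source title, licence name) pairs whose licence is not among the table's licences."""
    # Licence names declared for the published table itself.
    table_licenses = {lic.get("name") for lic in metadata.get("licenses", [])}
    mismatches = []
    for source in metadata.get("sources", []):
        for lic in source.get("licenses", []):
            name = lic.get("name")
            if name not in table_licenses:
                mismatches.append((source.get("title"), name))
    return mismatches


# Hypothetical metadata: the source is under one version, the table under another.
example = {
    "licenses": [{"name": "EXAMPLE-4.0"}],
    "sources": [
        {"title": "Original submission", "licenses": [{"name": "EXAMPLE-2.5"}]},
    ],
}

for title, name in license_mismatches(example):
    print(f"Source '{title}' is licensed as {name}, which the table does not list")
```

A non-empty result does not mean the relicensing is wrong. It only flags tables where the question should be asked.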

han-f commented 2 years ago

Thanks for this. As we created a new database by harmonising terminology across all the submission years (they were different) and by introducing foreign key tables etc., does this not allow for a new license, as this is very different from how the original data comes about? Also, the original data does not come as a table that holds all European Member States but as one table for each Member State for each submission.

Please note - the original licences also differ between data with source eionet (up to 2019 submissions) and source reportnet (starting with the 2021 submission) cc @wingechr

steull commented 2 years ago

We realised that the metadata doesn't follow the latest version, v1.5.0. We will add new files. Furthermore, we realised that the table names are difficult to understand (eio_ir_art...). We suggest naming the files: scenario.eea_eionet_eu_legislationdata... Perhaps it would be useful to include some parameter names for better understanding.
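Whether a metadata string declares the expected version can be checked with a few lines. The sketch below assumes the version is stored under `metaMetadata.metadataVersion` in the form `OEP-1.5.0`. The helper names and the example record are hypothetical.

```python
import json
from typing import Optional


def metadata_version(metadata: dict) -> Optional[str]:
    """Return the declared metadata version string, or None if it is absent."""
    meta = metadata.get("metaMetadata") or {}
    return meta.get("metadataVersion")


def needs_update(metadata: dict, expected: str = "OEP-1.5.0") -> bool:
    """True if the declared version differs from the expected one (or is missing)."""
    return metadata_version(metadata) != expected


# Hypothetical metadata string that still declares an older version.
example = json.loads(
    '{"name": "example_table", "metaMetadata": {"metadataVersion": "OEP-1.4.0"}}'
)

print(metadata_version(example), needs_update(example))
```

This only compares the declared version string. It does not validate that the fields required by that version are actually present.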

han-f commented 2 years ago

@wingechr what do you think?

wingechr commented 2 years ago

I think the somewhat cryptic table names are ok, otherwise they become too long. The human readable title exists for that very reason.

About the metadata: what is missing for v.1.5.0?

And thanks for checking

Ludee commented 2 years ago

The cryptic names are OK but we were not able to identify what it means. It would make sense to add additional info to the "description" or explain it here. I suggest an alternative name that is closer to the OEP naming convention: oekoinstitut_eea_eionet_eu_emissions_per_country_2020

I would also suggest to add some text about which data has been collected from the eionet website. A comment of the method of collection would be good as well. Was it by hand or scripted?

Ludee commented 2 years ago

I just made a check of the parameter table and found additional sources. I would suggest to harmonise them for all files.

han-f commented 2 years ago

> I just made a check of the parameter table and found additional sources. I would suggest to harmonise them for all files.

The sources provided are correct and they depend on the year the parameter table was submitted. They cannot be harmonised for all files. As the metadata is for each table, this needs to remain specific. It would be great if there was a way to describe, potentially in addition, the whole data collection in a separate metadata file or "cover page". That could then also include a bit more of the storyline on how and where the data was collected etc. If that was covered in the study factsheets alpha, these might function as such a cover page?

> I would also suggest to add some text about which data has been collected from the eionet website. A comment of the method of collection would be good as well. Was it by hand or scripted?

What would be the appropriate field in the metadata to add this information? We do not have a "methodology" field and to my understanding the description field would explain the contents.

Would you suggest to add a new "_comment" for tackling both?

cc @wingechr @adelmemariani

steull commented 2 years ago

All metadata looks fine, we have updated them to v1.5.1. @Ludee & @han-f thank you for your support :)

Ludee commented 2 years ago

It seems like we had a misunderstanding. The tables we just moved were not ready for review yet. Please move them back to model_draft.

Ludee commented 2 years ago

@wingechr @han-f please update the first comment with all tables that should stay in scenario.

wingechr commented 2 years ago

it's easier for us if i move the tables back into model_draft (which I will do if no one objects). these are:

wingechr commented 2 years ago

I moved the tables.

NOTE: for some reason, I got 500 Errors, although it seemed to have worked.

wingechr commented 2 years ago

moving tables from scenario to model_draft via API used to work. now it still worked, but threw 500 server side errors and left table artefacts:

[screenshot]

@MGlauer: could you at some point run the mirror task on the production server? thx

@Ludee I have a hunch that the error is related to the fact that you moved the tables, but you were not the owner?
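Since the move returned 500 errors but appeared to have worked, one option is to check the resulting state rather than rely on the status code. The sketch below is independent of the OEP API. The lookup `exists(schema, table)` is a hypothetical callable that would have to be backed by a real query, and the table name in the example is a placeholder.

```python
from typing import Callable


def move_outcome(
    table: str,
    source_schema: str,
    target_schema: str,
    exists: Callable[[str, str], bool],
) -> str:
    """Classify the state after a move attempt by looking at both schemas."""
    in_source = exists(source_schema, table)
    in_target = exists(target_schema, table)
    if in_target and not in_source:
        return "moved"
    if in_source and not in_target:
        return "not moved"
    if in_source and in_target:
        return "duplicate"  # table present in both schemas, e.g. a leftover artefact
    return "missing"


# Hypothetical catalogue standing in for the platform's table listing.
catalogue = {("model_draft", "example_table"), ("scenario", "example_table")}


def lookup(schema: str, table: str) -> bool:
    return (schema, table) in catalogue


print(move_outcome("example_table", "scenario", "model_draft", lookup))
```

A result of `duplicate` would correspond to the situation described above, where the move seemed to succeed but left table artefacts behind.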

Ludee commented 2 years ago

I'm not sure if it is due to the permissions. I rather think it is some kind of bug when moving tables with existing metadata. The tables are still in scenario and the suggested renaming has not been implemented. I'm not sure how to best rename existing tables. This can cause a lot of other problems.

han-f commented 2 years ago

I think the tables are back in scenario, since @MGlauer had deleted the table artefacts and @wingechr had then moved the tables again.

I also think we can keep the table names as they are now and apply new names for tables that we subsequently add. Like this we won't produce further problems as these tables were now already shared with external people and I would be hesitant to do anything that could throw errors.

Ludee commented 2 years ago

OK, I totally agree with the renaming. Let's keep it like this.

I updated the comment above to include all related tables. Please check if the list is complete.

Ludee commented 2 years ago

As discussed today, @steull will update all metadata to v1.5.1, add it to the repo and include the review information (issue link and badge). Thanks to everybody for the constructive feedback!

steull commented 2 years ago

I have updated all tables on the OEP with the new version of the metadata

Ludee commented 2 years ago

Please add all metadata strings from the linked OEP tables to the repo. @steull In addition revert the changes by reviewers in the metadata -> remove "Öko-Institut" from title.

steull commented 2 years ago

All metadata are available under the updated link in the description. I also changed the title and removed "Öko-Institut".

han-f commented 2 years ago

I still see the "Öko-Institut" prefix on some tables, even after refreshing:

https://openenergy-platform.org/dataedit/view/scenario?query=&tags=180&tags=326

cc @wingechr

han-f commented 2 years ago

That said - now all parameter table titles also refer to the Monitoring Mechanism Regulation (MMR), which is not the case for 2021. The description also seems to have been copied over and no longer reflects the data sources.

[screenshot]

I will edit that back by hand now.

Good that the review process will be made easier. Sometimes data owners use the functionality on the OEP to edit and add metadata, and those changes may then not be reflected on GitHub.
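Drift between the metadata in the repository and the metadata edited on the platform could be spotted by comparing the two JSON documents key by key. The following is a minimal sketch with plain dictionaries. The function name `diff_metadata` and both example records are hypothetical, and lists are compared as whole values rather than element by element.

```python
def diff_metadata(repo: dict, platform: dict, prefix: str = "") -> list:
    """Return dotted key paths whose values differ between the two metadata dicts."""
    differences = []
    for key in sorted(set(repo) | set(platform)):
        path = f"{prefix}.{key}" if prefix else key
        left, right = repo.get(key), platform.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            # Recurse into nested objects so the report points at the exact field.
            differences.extend(diff_metadata(left, right, path))
        elif left != right:
            differences.append(path)
    return differences


# Hypothetical copies of the same table's metadata.
repo_copy = {
    "title": "Example title",
    "metaMetadata": {"metadataVersion": "OEP-1.5.1"},
}
platform_copy = {
    "title": "Example title (edited on the platform)",
    "description": "Added by hand",
    "metaMetadata": {"metadataVersion": "OEP-1.5.1"},
}

print(diff_metadata(repo_copy, platform_copy))
```

The output lists only the differing paths, which is usually enough to decide which copy should be treated as current.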