OpenEnergyPlatform / academy

The Open Energy Academy is a collection of courses, tutorials, and questions for the Open Energy Family
https://openenergyplatform.github.io/academy/
GNU Affero General Public License v3.0
16 stars 6 forks source link

Tutorial: How to generate database conform data #48

Closed klarareder closed 3 years ago

klarareder commented 5 years ago

relates to issue: #9

klarareder commented 5 years ago

Database conform data The following checklist will help you to prepare your data in a way that you will have less trouble uploading it into the OEP. The main purpose is to have your data machine-readable and simultaneously for humans easy to understand.

  1. Only work with tables
  2. If you work with Excel: use only ONE table per excel worksheet (see figure poor example and good example on how to split one table into multiple tables).
  3. Use English table names
  4. Use consistent file names. This will facilitate the upload. a. Good example: energy_load_germany_pv, energy_load_germany_wind, hence: energy_loadgermany* b. Poor example: energy_pv_load, wind_energie_Deutschland_version3
  5. Try to keep all files in one folder or in folders with consistent names as described in 4. to be easily machine readable
  6. Follow the conventions for columns: a. Each column has only one entry per cell (see figure good example) b. Speaking names (see figure good example) or: good example: engery_load_germany_pv vs. poor example: lpv_vs4 c. Follow naming conventions e.g. SI units if possible d. Check if it is possible to transpose your file (e.g. in case of years as columns, in case of parameters as columns), see figure poor example and good example. A tool helping you with this can be found at: #OpenEnergyPlatform/oeplatform/issues/350 e. Never start column names with a number (will not work during upload), figure good example f. Use English column names g. Make sure each column contains only specific datatypes: string (e.g. “this is a string”), integer (e.g. 1,2,-5), float (e.g. 1.544) … see figure good example
  7. Think about the decimal seperator and deliminator you are using. Compare with https://github.com/OpenEnergyPlatform/examples/wiki/Metadata-Description
  8. Think about what you mean by missing values: if you implicit mean zero enter ‘0’. If you mean there were no data available leave it empty or set 'NULL'. Empty fields will be changed to NULL during upload.
  9. Make sure you have one or more primary keys so that each entry is unique. A primary key is a unique identifier. a. Example: first name, surname, passport_nr The passport_nr is a primariy key, because it is unique and identifies each person
  10. In case of dates a. Always use the same format. Always use ISO 8601 https://en.wikipedia.org/wiki/ISO_8601 b. Think about your time zone c. In case of time intervals, do you have a left stamp, right stamp or middle stamp?
  11. Implement your model that the result has always the same format: that will make your life easier to work with the OEP but also for your own post-processing
  12. If you have multiple tables, make sure they can be linked to one another. Ideally you should link tables to other tables via their id column. Otherwise you might collect double entries in the database. a. e.g. Person with first and surname and table with cars and car owners surname. And see figure: good example

This is a poor and a good example. This gives an Idea how to solve these problems, there are many good ways to solve it and two possibilities are show here.

poor example

grafik

good example In the good example there are two options how to split the timer series.

  1. This option is used in case of many wind turbine and solar park data:

grafik

  1. This option is used if there are not as many wind turbines and solar parks

grafik

klarareder commented 5 years ago

I have have the following two questions concerning the OEP, numbering from above:

  1. OEP requires '.' or ',' or does it matter?
  2. OEP and nodata values?
klarareder commented 5 years ago

@christian-rli I assigned you for this issues because we are both working on the tutorials issue. I made the first tutorial. Please feedback on:

  1. Content
  2. Spelling etc.
  3. The question from above
  4. anything else
christian-rli commented 5 years ago

This is a good collection of data best practices @klarareder Some quick feedback:

6a: should this say each cell? 6d: "In case of years as columns" should not be in brackets, but start the sentence in my opinion. 7: wording: "decimal separator" instead of "separation". we need to differentiate "decimal separator" (symbol between integer part and fraction part of a number) and delimiter (symbol separating fields/columns/cells). There is no one mandatory way / best practice / one correct way of what symbols to use, only culturally different ways. people do have to specify what system they're using, though. As this is slightly relevant, see here for a complete list of metadata values with a short description that I just finished: https://github.com/OpenEnergyPlatform/examples/wiki/Metadata-Description 8: In a database, cells without values are always "NULL". After uploading data with empty cells, when you query the values of those empty cells you get NULL as a value. You can also explicitly set empty cells NULL, but it's not really necessary. Also, wording: "Think about what..." 10a: Always use the same format. Always use ISO 8601. https://xkcd.com/1179/ , https://en.wikipedia.org/wiki/ISO_8601

  1. Ideally you should link tables to other tables via their id column. Otherwise you might collect double entries in the database.
klarareder commented 5 years ago

@christian-rli thanks for the good feedback. I changed it to your suggestions. Only 6d I reworded: not only in case of years as colums one can transpose data but also in cases like having different parameters etc.

klarareder commented 5 years ago

@christian-rli I placed the obove text on the OEP. If you check out features/tutorial-app you can see the changes and check if everything is ok (links working, no copy paste errors, texts ok etc.) I forgot to mention the Issue in the commit, that is why it is not shown here, sorry. Once you agree with it, we can start a pull request and close this issue.

klarareder commented 5 years ago

@MGlauer I pushed everything which was shown on my change of file list, but I noticed, that the static folder is not shown and thus I am not sure if I pushed the images.

MGlauer commented 5 years ago

Did you add the folder to your stage (e.g. with git add)

christian-rli commented 5 years ago

@klarareder I did not read through the whole thing yet, but I had a quick look and noticed that a) links all worked, but b) there were no images. So it seems like you did not include your static folder (or the images).

christian-rli commented 5 years ago

@klarareder Reading over the text I noticed a couple things. At one point you write "speaking names". I'm not really sure what you mean by that, but your example makes me guess something like "expressive names"? Also, in 7 there is a "deliminator" that's probably meant to read "delimiter". The text is fine otherwise. Just add the images and it's ready to go.

klarareder commented 3 years ago

@christian-rli is this included in the new tutorial page in the upcoming release next week? Or has this to be added to the Tutorials once they are online?

klarareder commented 3 years ago

done: https://openenergy-platform.org/tutorials/7/