KSSno / Klimakverna-pilot-KAPy

Code repository production chain pilot using KAPy. Affects WP2, WP3 and WP3B

(WIP) Test case 7: Metadata filling for variables and indices #9

Open ketilt opened 1 week ago

ketilt commented 1 week ago

Related user stories (same as test case 2):

Sources:
https://github.com/KSSno/Klimakverna-WP1-WP2-WP3b-WP5/issues/11
https://docs.google.com/document/d/12joGOmBR4xNjGW45rXxaNbIMv8jDtqAZL0duI7WlwGk/edit?usp=sharing

Input files

This task relies on output netCDF files from test cases 1 and 5.

For list of required metadata attributes, see: https://docs.google.com/spreadsheets/d/1L4dwsB3iH11kxyIvqtnhpgJyaKDWvoGgDCg0klff83c/edit#gid=1886269309

Method

This test case builds upon test case 1 (producing a climate index, being a statistical aggregation of other datasets) and loosely on test case 5 (producing a climate variable, a derivative dataset that does not aggregate, and which keeps the same time resolution as its parents). Separate subtasks should be created for each of these cases if needed. The aim of the current test case is to ensure that netCDF files are written with the correct metadata, as per the spreadsheet above.

Metadata fields can have different origins, including:

The purpose of this test case is to identify the origin of each metadata field; to decide what information we should require a "dataset producer" to provide; and to implement the proper setting of metadata fields.

If possible, metadata filling should be treated as a step independent of the creation of new files, because we might want to fill or correct metadata in previously generated files without touching the data. Technically, this means that any information kept in memory during file creation will not be available when filling in the metadata of that same file. We want to find out whether this is possible, that is, whether the path and basic contents of a file are sufficient for identifying it and for fetching or deducing the correct metadata.
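To make the idea of an independent metadata step concrete, here is a minimal sketch in Python. It is not KAPy's implementation. The file naming convention, the attribute names in `FILENAME_FIELDS`, and the helpers `deduce_metadata` and `fill_metadata` are all hypothetical and chosen only for illustration; the real attribute list is the one in the spreadsheet above. The sketch assumes the `netCDF4` Python package for the write step.

```python
from pathlib import Path

# ASSUMPTION: a hypothetical naming convention, for illustration only:
#   <variable>_<source>_<experiment>_<frequency>.nc
# The attribute names below are placeholders, not the required list.
FILENAME_FIELDS = ("variable_id", "source_id", "experiment_id", "frequency")


def deduce_metadata(path):
    """Deduce global attributes from the file name alone (no file access)."""
    parts = Path(path).stem.split("_")
    if len(parts) != len(FILENAME_FIELDS):
        raise ValueError(f"cannot deduce metadata from file name: {path}")
    return dict(zip(FILENAME_FIELDS, parts))


def fill_metadata(path, producer_attrs=None):
    """Set global attributes on an existing netCDF file without rewriting data.

    Attributes deduced from the path are combined with those supplied by the
    dataset producer; producer-supplied values take precedence.
    """
    # Third-party dependency, imported here so that deduce_metadata can be
    # used and tested without it.
    from netCDF4 import Dataset

    attrs = deduce_metadata(path)
    attrs.update(producer_attrs or {})
    # Mode "a" opens the existing file for modification; only the global
    # attributes are set, the variables are left as they are.
    with Dataset(path, mode="a") as ds:
        ds.setncatts(attrs)
    return attrs
```

With this split, `deduce_metadata("outputs/tas_modelA_rcp45_day.nc")` can be checked on its own, while `fill_metadata` can be run later against previously generated files. Fields that can be deduced from neither the path nor the file contents are the ones a dataset producer would need to supply, here through the hypothetical `producer_attrs` argument.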

Output