KSSno / KAPy

Klimaatlases in Python
MIT License

Test case 2: Metadata check for bias-adjusted data based on CORDEX #2

Open ketilt opened 5 months ago

ketilt commented 5 months ago

Related user stories:

Sources:
https://github.com/KSSno/Klimakverna-WP1-WP2-WP3b-WP5/issues/11
https://docs.google.com/document/d/12joGOmBR4xNjGW45rXxaNbIMv8jDtqAZL0duI7WlwGk/edit?usp=sharing

Input files

Use the following input files, which are bias-adjusted and downscaled data from EURO-CORDEX and CMIP6:

/lustre/storeC-ext/users/kin2100/NVE/EQM/*/pr/[hist,rcp26,rcp45]/*nc4
/lustre/storeC-ext/users/kin2100/NVE/EQM/CMIP6/*/pr/*/*nc4
/lustre/storeC-ext/users/kin2100/MET/3DBC/application/*/pr/[hist,rcp26,rcp45]/*nc4
/lustre/storeC-ext/users/kin2100/MET/3DBC/application/CMIP6/*/pr/*/*nc4
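A note for anyone turning these paths into code: in Python's `glob`, square brackets denote a character class, so `[hist,rcp26,rcp45]` would not match the three directory names. The sketch below assumes the brackets are meant as a list of alternatives and expands them into one glob pattern per alternative. The helper name `expand_alternations` is hypothetical.

```python
import itertools
import re

# Matches a bracketed, comma-separated alternation such as "[hist,rcp26,rcp45]".
_ALTERNATION = re.compile(r"\[([^\[\]]*,[^\[\]]*)\]")


def expand_alternations(pattern: str) -> list[str]:
    """Expand every "[a,b,c]" group into one glob pattern per combination."""
    # With one capture group, re.split alternates literal text (even indices)
    # and the contents of each bracket group (odd indices).
    parts = _ALTERNATION.split(pattern)
    choices = [
        part.split(",") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


patterns = expand_alternations(
    "/lustre/storeC-ext/users/kin2100/NVE/EQM/*/pr/[hist,rcp26,rcp45]/*nc4"
)
for pattern in patterns:
    print(pattern)
```

Each resulting pattern can then be passed to `glob.glob` on a machine where the lustre paths are mounted.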

For a list of required metadata attributes, see the following file: https://docs.google.com/spreadsheets/d/1L4dwsB3iH11kxyIvqtnhpgJyaKDWvoGgDCg0klff83c/edit#gid=1886269309

Method

A metadata validator gets its own rule in the Snakefile, which can be invoked via snakemake on the command line. The validator is run for one specified file. Automatically checking multiple files or a whole directory of files is not necessary.
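As a rough sketch of the single-file behaviour described above, the Snakefile rule could call a small Python entry point from its `shell:` directive. The function name `run_validator`, the `validate` callback and the exit codes are assumptions for illustration, not existing KAPy code.

```python
import argparse
from pathlib import Path
from typing import Callable, Sequence


def run_validator(
    argv: Sequence[str],
    validate: Callable[[Path], list],
) -> int:
    """Validate exactly one file.

    Returns 0 if no problems were found, 1 if the validator reported
    problems, and 2 if the argument is not a regular file.
    """
    parser = argparse.ArgumentParser(description="Validate metadata of one file")
    parser.add_argument("path", type=Path, help="a single NetCDF file")
    args = parser.parse_args(argv)

    # Directories are rejected on purpose: checking whole folders is out of scope.
    if not args.path.is_file():
        print(f"not a regular file: {args.path}")
        return 2

    problems = validate(args.path)
    for problem in problems:
        print(problem)
    return 1 if problems else 0
```

A non-zero return value, passed on through `sys.exit`, would make snakemake mark the rule as failed.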

S-ENDA provides its own metadata validator. It can be used to check whether a given file has the correct ACDD attributes/metadata.

A full validation should check that all expected attributes are present and that they are valid. It must be assessed whether this can be done with S-ENDA's validator, or whether it should be supplemented with custom code for Klimakverna.
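If custom code turns out to be needed, the "present and valid" check could look roughly like the sketch below. It works on a plain mapping of global attributes, such as the one `xarray.open_dataset(path).attrs` returns. The rule set is a made-up example using three ACDD attribute names. The real list is the one in the linked spreadsheet, and this is not how S-ENDA's validator works internally.

```python
from typing import Callable, Mapping


def _non_empty_string(value: object) -> bool:
    return isinstance(value, str) and value.strip() != ""


# Hypothetical rules: attribute name -> predicate that accepts a valid value.
EXAMPLE_RULES: dict[str, Callable[[object], bool]] = {
    "title": _non_empty_string,
    "summary": _non_empty_string,
    "keywords": _non_empty_string,
}


def check_global_attributes(
    attrs: Mapping[str, object],
    rules: Mapping[str, Callable[[object], bool]],
) -> list[str]:
    """Return one message per attribute that is missing or fails its rule."""
    problems = []
    for name, is_valid in rules.items():
        if name not in attrs:
            problems.append(f"missing attribute: {name}")
        elif not is_valid(attrs[name]):
            problems.append(f"invalid value for {name}: {attrs[name]!r}")
    return problems


# Example: title is fine, summary is blank, keywords is absent.
example_attrs = {"title": "Daily precipitation", "summary": "  "}
print(check_global_attributes(example_attrs, EXAMPLE_RULES))
```

An empty list means the file passed every rule in the given set.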

Output

ketilt commented 2 weeks ago

@shamlymajeed Thanks for the feedback and ideas in this week's meeting. Let me know if the flow chart I shared on chat makes it easier to see the bigger picture. It didn't end up quite as clear as I had hoped.

For the part of this test case concerning variable attributes, there is overlap with test case 7 (https://github.com/KSSno/KAPy/issues/7). That test case concerns setting metadata after new files are created, and it might be unnecessary to separate this from validation. So we might want to combine the two, or to consider test case 7 as building upon test case 2. We could discuss this in a later meeting.

Setting (and checking) descriptive variable attributes will most likely require some form of reference file for every given dataset. That is something I have added in test case 7. The spreadsheet linked from both test cases is an attempt at defining those attributes for each planned dataset, but we'll need something machine-readable, like a YAML file with a set of required fields. This will then supplement the S-ENDA validation of required keywords etc.
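To make the reference-file idea a bit more concrete, here is a minimal sketch. The `REFERENCE` dict stands in for what a per-dataset YAML file might contain once loaded into Python. Its layout, the function name and the expected values are all assumptions for illustration, not an agreed format.

```python
from typing import Mapping

# Hypothetical reference spec for one dataset, shaped like a loaded YAML file.
# The attribute values are illustrative, not the agreed ones for these datasets.
REFERENCE = {
    "variables": {
        "pr": {
            "standard_name": "precipitation_flux",
            "units": "kg m-2 s-1",
        },
    },
}


def check_variable_attributes(
    file_vars: Mapping[str, Mapping[str, object]],
    reference: Mapping[str, Mapping[str, Mapping[str, object]]],
) -> list[str]:
    """Compare each variable's attributes in a file against the reference."""
    problems = []
    for var, expected in reference["variables"].items():
        if var not in file_vars:
            problems.append(f"{var}: variable not found")
            continue
        actual = file_vars[var]
        for key, want in expected.items():
            if key not in actual:
                problems.append(f"{var}: missing attribute {key}")
            elif actual[key] != want:
                problems.append(f"{var}: {key} is {actual[key]!r}, expected {want!r}")
    return problems


# Example: attributes as they might be read from a file, with a units mismatch.
example_vars = {"pr": {"standard_name": "precipitation_flux", "units": "mm"}}
print(check_variable_attributes(example_vars, REFERENCE))
```

The same spec could also drive test case 7, by writing the expected values into a new file instead of comparing against them.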

ketilt commented 1 week ago

Copy QA-checked files to /lustre/storeC/development/output/