wmo-im / iwxxm-codelists

Code list management for WMO published content

Add CSVs to Support Creation of TTLs for Registry #8

Open amilan17 opened 3 years ago

amilan17 commented 3 years ago

1. Conversion of existing code lists in TTL format to CSV
2. Adjustment of the structure and scripts to fit into the existing workflow, similar to the repository for the WIGOS Metadata Standard

amilan17 commented 3 years ago

potential workflow

  1. Download CSVs from the Codes Registry API: http://codes.wmo.int/ui/sparql-form
  2. Cross check with codes in GitHub repo
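
The cross-check in step 2 could be sketched roughly as below. This is only an illustration, not an existing script: the `notation` column name, the helper names, and the assumption that each code maps to one `*.ttl` file named after it are all hypothetical.

```python
import csv
import io
from pathlib import Path


def registry_codes(csv_text, column="notation"):
    """Return the set of code identifiers in one column of a registry CSV export.

    The column name "notation" is an assumption; adjust it to the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column].strip() for row in reader if row.get(column)}


def repo_codes(table_dir):
    """Return the identifiers implied by the *.ttl file names in one table directory."""
    return {path.stem for path in Path(table_dir).glob("*.ttl")}


def cross_check(registry, repo):
    """List the identifiers that appear on only one side."""
    return {
        "only_in_registry": sorted(registry - repo),
        "only_in_repo": sorted(repo - registry),
    }
```

An empty result on both sides would mean the registry download and the repository directory list the same codes; it says nothing about whether the content of each entry matches.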

potential directory structures

Option A

- 306.csv
- 49-2.csv
- common.csv
- scripts
- ttl
    - 306
    - 49-2
    - common
- README.md

Option B

- 306
   - 306.csv
   - ttl
      - *.ttl 
- 49-2
   - AirWxPhenomena.csv
   - SigWxPhenomena.csv
   - *.csv
   - ttl
      - *.ttl
- common
- scripts
- README.md

Current directory structure (Mar 2021)

- 306
   - 4678
      - +DS.ttl
      - +DZ.ttl
      - ...etc.
   - 4678.ttl 
- 49-2
   - AirWxPhenomena
      - BKN_CLD.ttl
      - FRQ_CD.ttl
      - ...etc. 
   - AviationColourCode
   - ...
   - AirWxPhenomena.ttl
   - AviationColourCode.ttl
   - ... 
- common
- README.md

blchoy commented 3 years ago

As discussed (@amilan17, @jkorosi and Choy), we will experiment with this new structure under Jan's own GitHub account until it is mature enough to move to wmo-im.

I think we cannot deviate too far from the hierarchy of the code tables in the registry. At the same time, we may also want to distinguish the sub-directories the team will be working on from the one containing the files that will be used to update the registry.

I would like to suggest the following structure of the repository:

- IWXXM-CodeList    <- Team members will be mostly dealing with this
  - 306
    - 4678
      - +DS.csv
      - +DZ.csv
      - ...
    - 4678.csv
  - 306.csv
  - 49-2
    - AirWxPhenomena
      - BKN_CLD.csv
      - FRQ_CD.csv
      - ...
    - AirWxPhenomena.csv
    - ...
  - 49-2.csv
- TTL               <- This is supposed to contain the outputs of Mark's script for upload to the registry                 
  - 306
    - 4678
      - +DS.ttl
      - +DZ.ttl
      - ...
    - 4678.ttl
  - 306.ttl
  - 49-2
    - AirWxPhenomena
      - BKN_CLD.ttl
      - FRQ_CD.ttl
      - ...
    - AirWxPhenomena.ttl
    - ...
  - 49-2.ttl
- bin               <- Or 'scripts' which contains the scripts used for CI and if necessary CD
- documents         <- Don't know if this is still needed
- README.md
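
Since the TTL tree mirrors the CSV tree in this proposal, the output location of each file can be derived from its input path. A minimal sketch of that mapping, assuming the two top-level directory names suggested above (the function name `ttl_path_for` is made up for illustration and is not part of Mark's script):

```python
from pathlib import PurePosixPath


def ttl_path_for(csv_path, csv_root="IWXXM-CodeList", ttl_root="TTL"):
    """Map a CSV path under csv_root to the mirrored .ttl path under ttl_root.

    Raises ValueError if csv_path does not lie under csv_root.
    """
    relative = PurePosixPath(csv_path).relative_to(csv_root)
    return PurePosixPath(ttl_root) / relative.with_suffix(".ttl")
```

For example, `ttl_path_for("IWXXM-CodeList/306/4678/+DS.csv")` yields `TTL/306/4678/+DS.ttl`, so a CI job would only need the CSV path to know where the generated file belongs.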

I am not sure whether the code tables under 'common' should be handled by this team. The same goes for the CSV or TTL files holding the descriptions on the main page, as we may want to change those descriptions too:

image

jkorosi commented 3 years ago

I created a new branch: https://github.com/jkorosi/IWXXMCodeLists/tree/Add-CSVs-to-Support-Creation-of-TTLs-for-Registry-%238. I decided to create two directories (CSV and TTL) because there will be more directories (e.g. for scripts) and I don't want to mix the CSV files with them. I also tried to unify the column names, so some tables have empty columns. I believe that properties such as description should be filled in later.
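
To illustrate what unifying the column names means in practice, here is a rough sketch that rewrites any table to one shared header and leaves missing columns empty. The header in `UNIFIED_COLUMNS` is a placeholder, not the column set actually used in the branch, and the function is hypothetical.

```python
import csv
import io

# Hypothetical unified header; the real column set is still under discussion.
UNIFIED_COLUMNS = ["notation", "label", "description"]


def unify_columns(csv_text, columns=UNIFIED_COLUMNS):
    """Rewrite a CSV so it has exactly the given columns.

    Columns missing from the input are written as empty cells.
    Input columns that are not in `columns` are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=columns,
        restval="",
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

Note that silently dropping unknown columns is a design choice of this sketch; a stricter version could raise an error instead so that no property is lost unnoticed.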

blchoy commented 3 years ago

Thanks Jan. Could @amilan17:

  1. confirm that the columns are adequate for the publication of tables in WMO No.306? I checked and the WMDS tables have a different structure, so I am not sure whether we need to align with anybody else right now or whether we should take the lead and create a new structure.
  2. discuss with @marqh where to put the version information for each table, and how it can be passed on to the TTL files and inserted into the Codes Registry once the table versioning feature is available on the registry?

I would also like to add one more column to indicate whether a row is new or has been modified compared with what is online (or with local RDF files downloaded from the registry), so that the CI script can safely skip unchanged rows, or verify that a flagged row really has changed, when committing. This should also make checking easier. Views are most welcome.
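
As a rough idea of how such a status could be derived automatically rather than maintained by hand, the sketch below compares a local CSV with a copy downloaded from the registry. The key column `notation`, the status names and the function itself are assumptions for illustration only.

```python
import csv
import io


def flag_rows(local_csv, registry_csv, key="notation"):
    """Return {key value: status}, where status is 'new', 'modified' or 'unchanged'.

    Rows are compared cell by cell, so both files must share the same columns;
    otherwise every row present on both sides is reported as 'modified'.
    """
    registry = {row[key]: row for row in csv.DictReader(io.StringIO(registry_csv))}
    status = {}
    for row in csv.DictReader(io.StringIO(local_csv)):
        published = registry.get(row[key])
        if published is None:
            status[row[key]] = "new"
        elif published != row:
            status[row[key]] = "modified"
        else:
            status[row[key]] = "unchanged"
    return status
```

A CI script could then skip every row reported as `unchanged` and only generate or upload TTL for the rest. Rows that exist in the registry but not locally are not reported by this sketch.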

I am also bringing this to the attention of @jitsukoh to see if she has any views on this.