alan-turing-institute / whatwhat

A reimagining of nowwhat in OCaml
MIT License

improving metadata validation robustness #21

Closed callummole closed 2 years ago

callummole commented 2 years ago

In PR #20 we separated the metadata validation into Github.ml. Running this code logs a lot of errors and warnings, e.g. this snapshot:

Warning: Metadata Parsing (num: 1079): key min-FTE-percent, null value
Error: Metadata Parsing (num: 1069): Expected 8 metadata keys, got 7
Warning: Metadata Parsing (num: 993): key max-FTE-percent, additional info - 
Warning: Metadata Parsing (num: 993): key min-FTE-percent, null value
Error: Metadata Parsing (num: 1056): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 953): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 190): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 205): Expected 8 metadata keys, got 6
Error: Metadata Parsing (num: 158): Expected 8 metadata keys, got 18
Error: Metadata Parsing (num: 106): Expected 8 metadata keys, got 4
Error: Metadata Parsing (num: 687): Expected 8 metadata keys, got 2
Error: Metadata Parsing (num: 887): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 201): Expected 8 metadata keys, got 1
Warning: Metadata Parsing (num: 1219): key turing-project-code, null value
Warning: Metadata Parsing (num: 1219): key max-FTE-percent, additional info - 
Warning: Metadata Parsing (num: 1219): key min-FTE-percent, null value
Error: Metadata Parsing (num: 607): Expected 8 metadata keys, got 16
Error: Metadata Parsing (num: 933): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 979): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 971): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 848): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 958): Expected 8 metadata keys, got 7
Error: Metadata Parsing (num: 96): Expected 8 metadata keys, got 16
Warning: Metadata Parsing (num: 1090): key turing-project-code, null value
Warning: Metadata Parsing (num: 1090): key max-FTE-percent, additional info - 

These need to be inspected to see whether they are genuine errors or whether the metadata parsing needs to be improved to handle different ways of specifying metadata. For example, on an initial inspection, some metadata blocks were delimited with --- instead of +++, and some had trailing spaces after the value entries.
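
As a rough illustration of what more tolerant parsing could look like, here is a minimal sketch that accepts either +++ or --- as the block delimiter and trims whitespace around keys and values. All of the function names (is_delimiter, parse_entry, parse_metadata) are hypothetical and are not taken from Github.ml, and the "key: value" line format is an assumption. The value "ABC" in the example is made up.

```ocaml
(* Hypothetical sketch: these names do not come from whatwhat's Github.ml. *)

(* A line is a block delimiter if, ignoring surrounding whitespace,
   it is exactly "+++" or "---". *)
let is_delimiter line =
  match String.trim line with
  | "+++" | "---" -> true
  | _ -> false

(* Split an assumed "key: value" line at the first colon, trimming both
   sides. An empty value becomes None so the caller can warn about it.
   Lines with no colon or an empty key yield None. *)
let parse_entry line =
  match String.index_opt line ':' with
  | None -> None
  | Some i ->
    let key = String.trim (String.sub line 0 i) in
    let value =
      String.trim (String.sub line (i + 1) (String.length line - i - 1))
    in
    if key = "" then None
    else Some (key, if value = "" then None else Some value)

(* Collect the entries between the first pair of delimiter lines.
   Returns [] if no complete block is found. *)
let parse_metadata body =
  let rec skip = function
    | [] -> []
    | l :: rest -> if is_delimiter l then collect [] rest else skip rest
  and collect acc = function
    | [] -> [] (* no closing delimiter: treat as no metadata block *)
    | l :: rest ->
      if is_delimiter l then List.rev acc
      else (
        match parse_entry l with
        | Some e -> collect (e :: acc) rest
        | None -> collect acc rest)
  in
  skip (String.split_on_char '\n' body)

(* Example: "---" delimiters, a trailing space after one value,
   and a key with no value. *)
let example =
  "Intro text\n---\nturing-project-code: ABC  \nmin-FTE-percent:\n---\nMore text"

let () =
  assert (
    parse_metadata example
    = [ ("turing-project-code", Some "ABC"); ("min-FTE-percent", None) ])
```

Two caveats with this approach: it would also accept a block opened with +++ and closed with ---, and since --- is a horizontal rule in Markdown, an issue body containing rules could be misread as a metadata block. Both would need a decision before anything like this is adopted.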

mhauru commented 2 years ago

I expect that there will come a moment when we'll have to go through all the issues, standardise the format, and then start imposing that standard from there on.

triangle-man commented 2 years ago

Proposal (see also reporting.mld):

triangle-man commented 2 years ago

I've written some notes in lib/reporting.mld on the errors and warnings that ought to be reported. (Note that this document is not the last word!) I will close this issue, but we can open another, more specific one, e.g. "fix reporting by github.ml to match documentation" or "ensure all projects have valid metadata blocks".