HXLStandard / libhxl-python

Python support library for the Humanitarian Exchange Language (HXL) data standard.
The Unlicense

No validation for core schema entities #364

Open IanHopkinson opened 9 months ago

IanHopkinson commented 9 months ago

The HXL standard defines a set of core hashtags and attributes, stored here: https://data.humdata.org/dataset/hxl-core-schemas

However, the validator pays no attention to these. As a user, I want to check that my hashtags and attributes are in the core set, so that I only add non-standard hashtags and attributes deliberately.

I have implemented this check elsewhere using a manually compiled list of hashtags and attributes (before I found the list above!). Verbose example output from these checks is shown below.

I'm happy to try to implement this myself in libhxl with guidance.

VALID: '#country' applied to field 'Country' in 'insecurity-insight.shcc' is a valid HXL tag
VALID: 'name' applied to field  'Country' in 'insecurity-insight.shcc' is a core HXL attributes
NOT VALID: '#bogus' applied to field 'Number of reported incidents' in 'insecurity-insight.shcc' is not a core HXL tag
VALID: 'num' applied to field  'Number of reported incidents' in 'insecurity-insight.shcc' is a core HXL attributes
VALID: '#affected' applied to field 'Number of health workers killed' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'healthworker' applied to field  'Number of health workers killed' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#affected' applied to field 'Number of health workers kidnapped' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'healthworker,kidnapped' applied to field  'Number of health workers kidnapped' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#affected' applied to field 'Number of health workers arrested ' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'healthworker,arrested' applied to field  'Number of health workers arrested ' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#affected' applied to field 'Number of health workers injured' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'healthworker' applied to field  'Number of health workers injured' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#affected' applied to field 'Total health worker assaulted' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'healthworker,assaulted' applied to field  'Total health worker assaulted' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Total number of attacks on facilities which reported destruction' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'destroyed,health_facility' applied to field  'Total number of attacks on facilities which reported destruction' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Total number of attacks on facilities which reported damage' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'damaged,health_facility' applied to field  'Total number of attacks on facilities which reported damage' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Forceful entry into health facility' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'armed_entry,health_facility' applied to field  'Forceful entry into health facility' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Occupation of health facility' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'occupy,health_facility' applied to field  'Occupation of health facility' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Health transportation destroyed' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'health_transport,destroyed' applied to field  'Health transportation destroyed' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Health transportation damaged' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'health_transport,damaged' applied to field  'Health transportation damaged' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Health transportation stolen/highjacked' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'health_transport' applied to field  'Health transportation stolen/highjacked' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Looting, theft, robbery, burglary of health supplies' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'health_supplies,taken' applied to field  'Looting, theft, robbery, burglary of health supplies' in 'insecurity-insight.shcc' are not core HXL attributes
VALID: '#indicator' applied to field 'Obstruction to health care' in 'insecurity-insight.shcc' is a valid HXL tag
NOT VALID: 'health_obstruction' applied to field  'Obstruction to health care' in 'insecurity-insight.shcc' are not core HXL attributes
NOT VALID: Unknown tags across all columns: #bogus
NOT VALID: Unknown attiubutes across all columns: healthworker,assaulted,health_supplies,occupy,health_transport,arrested,health_facility,health_obstruction,destroyed,armed_entry,taken,kidnapped,damaged
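
A minimal sketch of how a check like this could work, using only the Python standard library. It is not libhxl's API: the function names (`check_hashtag_spec`, `summarise_unknown`) are hypothetical, and `CORE_TAGS` and `CORE_ATTRIBUTES` are small placeholder subsets based on the output above, not the full lists from the core schemas dataset.

```python
import re

# Placeholder subsets for illustration only (assumption). A real check
# would load the full lists from the HXL core schemas dataset.
CORE_TAGS = {"#country", "#affected", "#indicator"}
CORE_ATTRIBUTES = {"name", "num", "code"}

# A hashtag spec is a tag followed by zero or more +attributes,
# e.g. "#affected+healthworker+kidnapped".
HASHTAG_SPEC = re.compile(
    r"^\s*(#[A-Za-z][A-Za-z0-9_]*)((?:\s*\+[A-Za-z][A-Za-z0-9_]*)*)\s*$"
)


def check_hashtag_spec(spec, core_tags=CORE_TAGS, core_attributes=CORE_ATTRIBUTES):
    """Return (tag, tag_is_core, non_core_attributes) for one hashtag spec."""
    match = HASHTAG_SPEC.match(spec)
    if match is None:
        raise ValueError(f"Not a well-formed HXL hashtag spec: {spec!r}")
    tag = match.group(1).lower()
    # Splitting "+a+b" on "+" yields an empty first item, which is dropped.
    attributes = [a.strip().lower() for a in match.group(2).split("+") if a.strip()]
    unknown = [a for a in attributes if a not in core_attributes]
    return tag, tag in core_tags, unknown


def summarise_unknown(specs):
    """Collect non-core tags and attributes across all columns, sorted."""
    unknown_tags, unknown_attributes = set(), set()
    for spec in specs:
        tag, tag_is_core, unknown = check_hashtag_spec(spec)
        if not tag_is_core:
            unknown_tags.add(tag)
        unknown_attributes.update(unknown)
    return sorted(unknown_tags), sorted(unknown_attributes)


if __name__ == "__main__":
    columns = ["#country+name", "#bogus+num", "#affected+healthworker+kidnapped"]
    tags, attributes = summarise_unknown(columns)
    print("Unknown tags across all columns:", ",".join(tags))
    print("Unknown attributes across all columns:", ",".join(attributes))
```

Keeping the core lists as plain sets passed in as parameters means the same function could be pointed at lists loaded from the published schemas without changing the checking logic.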