[Schema] should taxonomy be optional in vulnerability?

GFDRR / rdl-standard

The Risk Data Library Standard (RDLS) is an open data standard to make it easier to work with disaster and climate risk data. It provides a common description of the data used and produced in risk assessments, including hazard, exposure, vulnerability, and modelled loss, or impact, data.

https://docs.riskdatalibrary.org/

Creative Commons Attribution Share Alike 4.0 International

[Schema] should taxonomy be optional in vulnerability? #202

Status: Closed (odscjen closed this 1 year ago)

odscjen commented 1 year ago

From the example being developed in https://github.com/GFDRR/rdl-standard/issues/135#issuecomment-1686621456, the dataset doesn't use a specific taxonomy scheme. Should we make this field optional instead of required?

If it does need to stay required, we should add a code to 'classification_schema.csv' to cover these scenarios for consistency, and add a note to the guidance stating what to use in this situation.

| Code | Title | Definition | Source | Category |
|---|---|---|---|---|
| internal | Internal | The categories defined within the dataset methodology. These have not been explicitly taken from a declared taxonomy scheme. | | |
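To illustrate the "stay required" option, here is a minimal sketch of how a required taxonomy value could be checked against the codelist once an `internal` code exists. The CSV excerpt is an assumption: only the `internal` row comes from the proposal above, the `GED4ALL` row is a placeholder, and the helper names `load_codes` and `taxonomy_is_valid` are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of classification_schema.csv. Only the "internal" row
# reflects the proposal in this issue; the GED4ALL row is a placeholder.
CODELIST_CSV = """Code,Title,Definition,Source,Category
GED4ALL,GED4ALL,Placeholder row for illustration.,,
internal,Internal,The categories defined within the dataset methodology.,,
"""


def load_codes(text):
    """Return the set of values found in the Code column."""
    return {row["Code"] for row in csv.DictReader(io.StringIO(text))}


def taxonomy_is_valid(value, codes):
    """A required taxonomy value is valid only if it is a codelist code."""
    return value in codes


codes = load_codes(CODELIST_CSV)
print(taxonomy_is_valid("internal", codes))  # dataset with custom categories
print(taxonomy_is_valid(None, codes))        # field left empty
```

With this approach a dataset with custom categories would declare `internal` rather than leave the field empty.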

stufraser1 commented 1 year ago

Good point - some vulnerability relationships relate to only a general occupancy type e.g. 'residential', 'commercial', 'industrial'.

Either we (1) assign a taxonomy type based on this (we can create one in the ODS/GEM taxonomies for general residential, with all other taxonomy string components as unknown), or (2) make taxonomy optional and fall back on the occupancy type only. Now that I look for it, occupancy type isn't within the schema or codelists. Was this removed? We had it in at one point.

We need some way to tie the V relationships to a type of exposure - and using exposure_category to tie it to 'buildings' is not enough.

odscjen commented 1 year ago

We did remove occupancy type, as this isn't necessarily the same across every asset in a dataset.

As it stands, vulnerability.taxonomy doesn't hold the actual taxonomy codes (for the same reason: they aren't the same across every asset in a dataset). It gives the name of the scheme that the taxonomy codes in the dataset are taken from, e.g. GED4ALL.
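A small sketch of that distinction, assuming a hypothetical metadata record and hypothetical per-asset rows (the key names `asset_id` and `taxonomy_code` and the placeholder codes are not taken from the schema):

```python
# Metadata level: one value per dataset, naming the scheme only.
vulnerability_metadata = {"taxonomy": "GED4ALL"}

# Data level (hypothetical rows): the actual codes, which can differ per asset.
asset_rows = [
    {"asset_id": "a1", "taxonomy_code": "CODE-A"},
    {"asset_id": "a2", "taxonomy_code": "CODE-B"},
]

# The set of codes in the data can have many members,
# while the metadata still names a single scheme.
distinct_codes = {row["taxonomy_code"] for row in asset_rows}
print(vulnerability_metadata["taxonomy"], len(distinct_codes))
```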

matamadio commented 1 year ago

Taxonomy should be optional, because exposure grouping is often custom.

> some vulnerability relationships relate to only a general occupancy type e.g. 'residential', 'commercial', 'industrial'.

That is true, yet datasets could include one or more of these occupancy types. Right now we identify only category (buildings, infrastructure, agriculture, population, natural environment). Occupancy type would refer only to buildings. I would keep occupancy details as part of data instead of metadata - at least for this release.

stufraser1 commented 1 year ago

Not having this in the metadata reduces search capability. It is useful for users searching V functions to filter by those suitable for residential buildings, or by a certain construction type, for example (see OpenQuake tool example). But yes, by including it in metadata we potentially introduce a long string / array of occupancy types.

odscjen commented 1 year ago

For this iteration of the standard I think we should make it optional. The next version of the standard could look to work out if occupancy is worth putting back in, and in conjunction making taxonomy (the name of the taxonomy scheme) required. @matamadio @stufraser1 is this okay?
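For illustration only, a minimal sketch of what the change means for validation. The required-field lists and the example record are assumptions, not the actual schema; the point is just that a dataset with no declared taxonomy scheme fails when taxonomy is required and passes when it is optional.

```python
def missing_required(record, required):
    """Return the required field names that are absent from a record."""
    return [name for name in required if name not in record]


# Hypothetical required-field lists for the vulnerability component.
REQUIRED_CURRENT = ["category", "taxonomy"]  # taxonomy required
REQUIRED_PROPOSED = ["category"]             # taxonomy optional

# Hypothetical metadata for a dataset with custom exposure grouping
# and no declared taxonomy scheme.
record = {"category": "buildings"}

print(missing_required(record, REQUIRED_CURRENT))   # taxonomy is reported missing
print(missing_required(record, REQUIRED_PROPOSED))  # nothing is missing
```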

odscjen commented 1 year ago

@stufraser1 are you okay with the above suggestion?