[Proposal] Vulnerability impact object

odscrachel commented 1 year ago

What is your proposed change?

Create an impact object to hold all of the impact description fields within the vulnerability component. This object will also appear in the loss component with a slightly different description.
Create a new codelist for impact.unit
Create a new codelist for impact.base_data_type

| Title | Field name | Description | Type | Codelist |
| --- | --- | --- | --- | --- |
| Vulnerability impact | `impact` | The details of the vulnerability impact values produced in the modelled scenario(s). | object | |
| Impact type | `impact.type` | The type of the impact calculated. | string | direct, indirect, total |
| Impact metric | `impact.metric` | The impact metric type. | string | absolute, relative |
| Impact unit | `impact.unit` | The name of the unit the impact value is expressed in. | string | |
| Impact base data type | `impact.base_data_type` | The type of data used to calculate the impact values. | string | simulated, inferred, observed |

stufraser1 commented 1 year ago

Have we not already specified some codes for impact.unit?

odscrachel commented 1 year ago

In DDH_v0.2.xlsx, the field vln_impact_metric (from loss, as opposed to vulnerability) has the following codes:

Annual Average Losses
Annual Average Loss Ratio
Probable Maximal Loss

Would these be relevant here? There was also a conversation here

duncandewhurst commented 1 year ago

@odscrachel I think those are metrics rather than units. In https://rdl-standard.readthedocs.io/en/sphinx/data_model/loss.html#flood-loss-scenarios-for-afghanistan-2050, the unit is 'USD'.

@stufraser1 are impacts and losses always expressed as financial values or are there other types of value and if so please could you provide some examples?

QUDT offers an ontology of units, which we might be able to reuse.

matamadio commented 1 year ago

> @stufraser1 are impacts and losses always expressed as financial values or are there other types of value and if so please could you provide some examples?

Yes, there might be other non-monetary values for impact.unit, e.g. area, % of area, % of total value.

duncandewhurst commented 1 year ago

Ah, I see. Area, % of area and % of total value aren't units exactly. In QUDT terms, they are a mix of quantity kinds (vocabulary) and units (vocabulary):

| Description | Quantity kind | Unit |
| --- | --- | --- |
| area | Area | Not specified, but might be M2 (Square Metre), for example |
| % of area | AreaRatio | PERCENT |
| % of total value | CurrencyRatio seems logical, but QUDT lacks such a code. | PERCENT |

It gets quite complicated to structure that level of detail and that is probably only necessary if there is a use case for performing calculations based on those fields.

Looking at the examples in this issue, I think a better approach might be:

- metric: an object with two fields:
  - .type: absolute, relative
  - .description: a free-text description of the metric
- unit: specify that this field should only be used for non-monetary metrics and refer to the QUDT unit codelist so that its values can be validated
- currency: add this field, to be used only for monetary metrics, and refer to the ISO 4217 currency codelist

Here's how that would look for each example. Let me know if I missed anything.

Area (assuming square metres)


{
  "impact": {
    "metric": {
      "type": "absolute",
      "description": "Area"
    },
    "unit": "M2"
  }
}

% of area

{
  "impact": {
    "metric": {
      "type": "relative",
      "description": "% of area"
    },
    "unit": "PERCENT"
  }
}

% of total value

{
  "impact": {
    "metric": {
      "type": "relative",
      "description": "% of total value"
    },
    "unit": "PERCENT"
  }
}

Annual Average Losses (assuming in USD)

{
  "impact": {
    "metric": {
      "type": "absolute",
      "description": "Annual average losses"
    },
    "currency": "USD"
  }
}

Annual Average Loss Ratio (assuming percent)

{
  "impact": {
    "metric": {
      "type": "relative",
      "description": "Annual average loss ratio"
    },
    "unit": "PERCENT"
  }
}

Probable Maximal Loss (assuming currency)

{
  "impact": {
    "metric": {
      "type": "absolute",
      "description": "Probable maximal loss"
    },
    "currency": "USD"
  }
}
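The examples above could be checked mechanically. The following is a minimal sketch only: the code sets are small illustrative subsets (the full lists would come from QUDT and ISO 4217), `validate_impact` is a hypothetical helper, and the "not both unit and currency" rule is an assumption drawn from the proposal that unit is for non-monetary metrics and currency for monetary ones.

```python
# Illustrative subsets only: the full lists would come from the QUDT unit
# vocabulary and the ISO 4217 currency codelist.
METRIC_TYPES = {"absolute", "relative"}
UNIT_CODES = {"PERCENT", "M2", "HR", "DAY", "MO", "YR"}
CURRENCY_CODES = {"USD", "EUR", "GBP"}


def validate_impact(impact):
    """Return a list of problems found in an impact object (empty if none)."""
    errors = []
    metric = impact.get("metric", {})
    if metric.get("type") not in METRIC_TYPES:
        errors.append("metric.type must be 'absolute' or 'relative'")

    has_unit = "unit" in impact
    has_currency = "currency" in impact
    # Assumed rule: unit is for non-monetary metrics and currency for monetary
    # ones, so a single impact object should not carry both.
    if has_unit and has_currency:
        errors.append("use either unit or currency, not both")
    if has_unit and impact["unit"] not in UNIT_CODES:
        errors.append("unknown unit code: " + impact["unit"])
    if has_currency and impact["currency"] not in CURRENCY_CODES:
        errors.append("unknown currency code: " + impact["currency"])
    return errors


area = {"metric": {"type": "absolute", "description": "Area"}, "unit": "M2"}
aal = {
    "metric": {"type": "absolute", "description": "Annual average losses"},
    "currency": "USD",
}
print(validate_impact(area))  # no problems expected
print(validate_impact(aal))   # no problems expected
```

An object with neither unit nor currency passes this check, which leaves room for unitless counts.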

stufraser1 commented 1 year ago

This proposal has taken a wrong turn somewhere down the line; we should have clarified this better in the spreadsheet. These impact metrics and units are relevant to loss.impact (with some additions like PML ratio to be discussed).

But vulnerability.impact should refer to the y-axis of a vulnerability function, fragility function, damage-to-loss function (where the hazard parameter is on the x-axis) or vulnerability index. The y-axis options are relative: in a vulnerability function they represent damage as a ratio of the total cost of a building or of the total number of people; in a fragility function they represent the probability of exceeding minor damage, moderate damage, severe damage or collapse:

| Title | Field name | Description | Type | Codelist |
| --- | --- | --- | --- | --- |
| Vulnerability impact | `impact` | The details of the vulnerability impact values produced in the modelled scenario(s). | object | |
| Impact type | `impact.type` | The type of the impact calculated. | string | direct, indirect, total |
| Impact metric | `impact.metric` | The impact metric type. | string | damage ratio, mean damage ratio, probability of exceeding damage limit state, loss ratio, total loss, number of casualties(fatalities), casualty(fatality) ratio, annual average loss, downtime, damage index |
| Impact unit | `impact.unit` | The name of the unit the impact value is expressed in. | string | percent |
| Impact base data type | `impact.base_data_type` | The type of data used to calculate the impact values. | string | simulated, inferred, observed |

Refer to vulnerability schema reports and Phase 3 report here https://riskdatalibrary.org/resources

matamadio commented 1 year ago

Agreed. Duncan's proposal relates more to the loss impact object than to the vulnerability model calculation.

duncandewhurst commented 1 year ago

Ah, apologies, I thought we were aiming for a common model for impact across vulnerability and loss.

Looking at the impact metrics in https://github.com/GFDRR/rdl-standard/issues/75#issuecomment-1568223027, I still think that we should separate out unit and currency and use the codelists proposed in https://github.com/GFDRR/rdl-standard/issues/75#issuecomment-1567580818:

| `impact.metric` | `impact.unit` (from QUDT) | `impact.currency` (from ISO-4217) |
| --- | --- | --- |
| damage ratio | PERCENT | |
| mean damage ratio | PERCENT | |
| probability of exceeding damage limit state | PERCENT | |
| loss ratio | PERCENT | |
| total loss | | USD (for example, could be another currency) |
| number of casualties(fatalities) | (not applicable, a count is unitless) | |
| casualty(fatality) ratio | PERCENT | |
| annual average loss | | USD (for example, could be another currency) |
| downtime | DAY (for example, could be HR or another unit of time) | |
| damage index | (not applicable) | |

QUDT is a very long codelist, so we can recommend a subset of its codes. From the examples in this issue, it looks like we need:

PERCENT
Units of time (HR, DAY, MO, YR)
Units of area (M2, not sure if others are relevant)

Does that sound good? Let me know if I misunderstood anything.
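To illustrate, the pairing of metric to unit or currency in this comment could be encoded roughly as follows. This is a sketch under assumptions: `METRIC_FIELDS` and `build_impact` are hypothetical names, the example codes are the ones given in the table, and falling back to the example code when none is supplied is purely for illustration.

```python
# Which field (unit or currency) each proposed impact.metric code would use,
# with the example code from the table in this comment. (None, None) marks a
# metric with no applicable unit (a count or an index).
METRIC_FIELDS = {
    "damage ratio": ("unit", "PERCENT"),
    "mean damage ratio": ("unit", "PERCENT"),
    "probability of exceeding damage limit state": ("unit", "PERCENT"),
    "loss ratio": ("unit", "PERCENT"),
    "total loss": ("currency", "USD"),
    "number of casualties(fatalities)": (None, None),
    "casualty(fatality) ratio": ("unit", "PERCENT"),
    "annual average loss": ("currency", "USD"),
    "downtime": ("unit", "DAY"),
    "damage index": (None, None),
}


def build_impact(metric, code=None):
    """Build an impact object that carries only the field the metric calls for."""
    field, example = METRIC_FIELDS[metric]
    impact = {"metric": metric}
    if field is not None:
        # Fall back to the example code from the table (illustration only).
        impact[field] = code or example
    return impact


print(build_impact("downtime", "HR"))
print(build_impact("total loss"))
print(build_impact("damage index"))
```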

matamadio commented 1 year ago

> Ah, apologies, I thought we were aiming for a common model for impact across vulnerability and loss.

They could be different, as the losses might be a translation of the vulnerability into a different unit. E.g. vulnerability model gives impact as % of total value; losses calculated from this model may be translated into USD or other.
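As a worked example of that translation (a sketch with made-up numbers; `loss_from_ratio` is a hypothetical helper, not part of the standard):

```python
def loss_from_ratio(damage_ratio_percent, total_value):
    """Translate a relative impact (% of total value) into an absolute loss."""
    return damage_ratio_percent / 100 * total_value


# Hypothetical numbers: a 25% damage ratio on an asset valued at 200,000 USD.
loss_usd = loss_from_ratio(25, 200_000)
print(loss_usd)  # 50000.0
```

The vulnerability model's impact here is unitless (PERCENT), while the derived loss takes the currency of the exposure value.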

> I still think that we should separate out unit and currency

I don't see the practical advantage, and I also find it confusing to have an empty unit field for currency. Could it be:

impact.unit

Percent (%)
Units of time (HR, DAY, MO, YR)
Units of area (m2, hectares, km2, more...)
Monetary (USD, PPP, more...)

Although this would cover most cases, I'm worried we can't really anticipate all possible units. Or can we?

duncandewhurst commented 1 year ago

Is your proposal to model the kind of quantity being measured (time, area, currency etc.) rather than the specific unit in which it is measured (hours, m2, USD etc.)?

If so, that seems logical to me, since it should be possible for users to convert between units within a particular kind of quantity (e.g. minutes to hours, m2 to hectares, USD to GBP etc.), but not necessarily possible to convert between different kinds of quantity.

However, I would name the field impact.quantityKind and use the following codelist, based on the QUDT quantity kind vocabulary:

Area
Count
Currency
DimensionlessRatio (percentages and probabilities etc.)
Time

We can't anticipate all possible quantity kinds, but we can:

Defer to the QUDT vocabulary, which attempts to do that.
Make quantity kind an open codelist so that publishers can use their own code if no QUDT code matches the semantics of the kind of quantity that their dataset measures.
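To illustrate why grouping by quantity kind is useful, here is a minimal sketch of unit conversion that is allowed within a kind and refused across kinds. The `UNITS` table and `convert` function are hypothetical, and "HA" as the code for hectare is an assumption.

```python
# Hypothetical conversion table: unit code -> (quantity kind, factor to the
# kind's base unit). "HA" for hectare is an assumed code.
UNITS = {
    "HR": ("Time", 1.0),
    "DAY": ("Time", 24.0),
    "M2": ("Area", 1.0),
    "HA": ("Area", 10_000.0),
}


def convert(value, from_unit, to_unit):
    """Convert a value between two units of the same quantity kind."""
    from_kind, from_factor = UNITS[from_unit]
    to_kind, to_factor = UNITS[to_unit]
    if from_kind != to_kind:
        raise ValueError(f"cannot convert {from_kind} to {to_kind}")
    return value * from_factor / to_factor


print(convert(48, "HR", "DAY"))  # 2.0
print(convert(2, "HA", "M2"))    # 20000.0
```

Currency is left out of the table because converting between currencies needs an exchange rate rather than a fixed factor.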

stufraser1 commented 1 year ago

We are still referencing the hazard intensity measure imt when describing the vulnerability curve, so for consistency and the ability to search/filter we should include the unit for the y-axis (the loss parameter).

This was the original intent, which we should retain:

Loss parameters e.g.:

Economic loss total (ELT): ELT represents the total level of economic loss expressed in USD.
Fatality total (FT): Total number of fatalities related to a hazard.
Fatality rate (FR): Fatality rate is defined as the ratio of the total number of disaster related fatalities (FT) to the total population exposed.
Relative loss (Rloss): The loss is shown as a fraction or percentage of the estimated total replacement value of property or area and its contents.
Damage index (DI): Damage Index judges the proportion of the replacement cost of the structure.

I don't think there is any way round having the metric AND the unit. impact.quantityKind doesn't give enough information for searching and understanding the curve in the way we intended.

Could we make pairs in the codelist, which would give the metric and the unit, which would signal which unit could be used with which metric?

> Although this would cover most cases, I'm worried we can't really anticipate all possible units. Or can we?

We can create a codelist to cover most cases (not currencies, but all the rest); vulnerability curves do not have an endless list of loss parameters.

odscjen commented 1 year ago

> Could we make pairs in the codelist, which would give the metric and the unit, which would signal which unit could be used with which metric?

This can be tricky when it comes to validating data. One possible way to do this is to make the codelist 2-tier, e.g.

| Code | Title | Description |
| --- | --- | --- |
| fatality.total | Fatality total | The total number of fatalities related to the hazard event |
| fatality.percent | Fatality ratio | The ratio of the total number of disaster related fatalities (FT) to the total population exposed expressed as a percentage. |
| downtime.days | Downtime days | The number of days of downtime |
| downtime.hours | Downtime hours | The number of hours of downtime |
| economicLoss.USD | Economic loss in USD | The total economic loss expressed in USD |

But this still isn't easy to validate and potentially confusing...

This could be an open codelist with instructions on how to create this compound type of code.

An alternative is to create a field for each possible metric and where more than one unit is applicable create a codelist for each.
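For the 2-tier option, a validator would need to split each compound code and check the pair. This is a rough sketch only: `parse_code`, `is_known_pair` and the `ALLOWED_UNITS` mapping are hypothetical, with the pairs taken from the example codes above.

```python
# Metric and second-tier parts taken from the example compound codes above.
ALLOWED_UNITS = {
    "fatality": {"total", "percent"},
    "downtime": {"days", "hours"},
    "economicLoss": {"USD"},
}


def parse_code(code):
    """Split a compound code of the form 'metric.unit' into its two parts."""
    metric, sep, unit = code.partition(".")
    if not sep or not metric or not unit:
        raise ValueError("expected a code of the form 'metric.unit': " + code)
    return metric, unit


def is_known_pair(code):
    """Return True if the metric and unit parts form a listed combination."""
    metric, unit = parse_code(code)
    return unit in ALLOWED_UNITS.get(metric, set())


print(parse_code("downtime.hours"))
print(is_known_pair("downtime.USD"))
```

With an open codelist, an unknown pair such as `economicLoss.EUR` would be reported as unlisted rather than rejected outright.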

matamadio commented 1 year ago

> We are still referencing the hazard intensity measure imt when describing the vulnerability curve, so for consistency and the ability to search/filter we should include the unit for the y-axis (the loss parameter).

See #5 resolution:

Propose to use an open imt list. Hazard and vulnerability datasets that align can be filtered on hazard and process type rather than imt, in the same way that they are filtered on occupancy category rather than the full taxonomy string.

stufraser1 commented 1 year ago

So, proposing to use open codelists and guidance for impact.metric and impact.unit to keep it relatively straightforward. We have had enough experts involved in the process to be confident we've covered most of the metrics and units.

Moving to 'agreed and ready' on project board

stufraser1 commented 1 year ago

SUPERSEDED: impact_metric.csv

Code,Label,Definition
absolute,Absolute,"Impact expressed in absolute terms such as currency, number of people or assets, kilometres, square kilometres, or time."
relative,Relative,"Impact expressed in relative terms such as a percentage, or proportion of a total"

stufraser1 commented 1 year ago

impact_type.csv

Code,Label,Definition
direct,Direct,The physical or structural impact caused by the disaster such as the destruction of infrastructure caused by the force of physical processes in an event.
indirect,Indirect,"Subsequent or secondary results of the initial damage or destruction, such as business interruption losses. Indirect losses may be incurred far away from the original affected area or later in time."
total,Total,Combination of direct and indirect losses or impact.

stufraser1 commented 1 year ago

data_calculation_type.csv

Code,Label,Definition
simulated,Simulated,Impact data originates from numerical simulation.
inferred,Inferred,Impact data that has been processed from original observation data.
observed,Observed,Impact data originates from observation such as post-event damage surveys.

EDIT: renamed this as it is also being used in hazard for the calculation_method field so we need a name that makes sense in both use cases.
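For anyone consuming these codelist files, a minimal sketch of reading one with Python's standard `csv` module follows. The CSV text is the data_calculation_type.csv content from this comment laid out one row per line, and `load_codes` is a hypothetical helper.

```python
import csv
import io

# Contents of data_calculation_type.csv, one codelist row per line.
CODELIST_CSV = """Code,Label,Definition
simulated,Simulated,Impact data originates from numerical simulation.
inferred,Inferred,Impact data that has been processed from original observation data.
observed,Observed,Impact data originates from observation such as post-event damage surveys.
"""


def load_codes(text):
    """Map each code in a codelist CSV to its definition."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Code"]: row["Definition"] for row in reader}


codes = load_codes(CODELIST_CSV)
print(sorted(codes))  # ['inferred', 'observed', 'simulated']
```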

odscjen commented 1 year ago

@stufraser1 should impact_metric.csv not be "damage ratio, mean damage ratio, probability of exceeding damage limit state, loss ratio, total loss, number of casualties(fatalities), casualty(fatality) ratio, annual average loss, downtime, damage index" as the codes?

(or @matamadio as I think Stu might be away for some of this week)

stufraser1 commented 1 year ago

Yes you're correct. Will revise as soon as I can.
