Scheme for configuration of damage functions within physrisk

joemoorhouse commented 3 months ago

It is desirable to be able to define, through configuration, the damage/disruption vulnerability functions that should be applied to each asset type. In general, these functions describe the vulnerability curve (i.e. the curve relating hazard indicator value to relative loss) and the uncertainty in that curve.

This issue is to define conventions for specification of the vulnerability functions, in particular:

* Asset naming schemes, and
* A scheme to map asset information onto curves that supports a range of use cases: cases where asset attributes are known in detail, and cases where asset specifics are unknown (including sectorial approaches).

joemoorhouse commented 3 months ago

In terms of conventions to adhere to, I suggest the following:

* We should use Open Exposure Data (OED) wherever possible: https://github.com/OasisLMF/ODS_OpenExposureData/blob/develop/OpenExposureData/Docs/OpenExposureData_Spec.xlsx. This particularly applies to the specification of asset attributes, e.g. number of storeys.
* Given the importance of sectorial approaches, mapping to NACE codes (Nomenclature statistique des Activités économiques dans la Communauté Européenne) is desirable: https://ec.europa.eu/eurostat/web/metadata/classifications
* Given the importance of Hazus to the project as a source of vulnerability curves, Hazus conventions (e.g. for occupancy) should be used.

FYI, @xbarra, @MichaelTiemann, @EglantineGiraud, @devarfima, @NickKellett

joemoorhouse commented 3 months ago

I note that for different use-cases the ways to specify different vulnerability curves may vary.

1) Asset-specific information available

A user may want to:

a) Specify a set of asset attributes and have the system give the best match to the specification, e.g. occupancy_scheme, occupancy_code, number_of_storeys, first_floor_height for a real estate asset

NickKellett commented 3 months ago

* We should use Open Exposure Data wherever possible (OED) https://github.com/OasisLMF/ODS_OpenExposureData/blob/develop/OpenExposureData/Docs/OpenExposureData_Spec.xlsx.

Should we have a similar ticket for standardizing the output, and if so would we ideally support Open Results Data (ORD) format there?

xbarra commented 3 months ago

@jmcano-arfima

joemoorhouse commented 2 months ago

In terms of representing curves, I see (at least) 3 use-cases. Perhaps the most frequently-used case will be where we have the hazard intensity ($x$) values and the corresponding impact (damage/disruption) ($y$) values. This can be captured by $x$ and $y$ fields each containing an array. We can add a third $z$ field to capture information about the uncertainty in the vulnerability functions.

Case 1: deterministic damage curve provided

The hazard intensity values $x_i$, $i \in [1 \dots n]$, are given together with the corresponding impacts $y_i$.

$x = [x_1, x_2, \dots, x_n]$
$y = [y_1, y_2, \dots, y_n]$
$z$ is empty
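
As a rough illustration of how a Case 1 curve might be consumed, the sketch below linearly interpolates the impact at an arbitrary hazard intensity. The numbers and the function name `impact` are hypothetical, and clamping to the end values outside the range of $x$ is an assumption rather than anything agreed in this issue.

```python
import numpy as np

# Hypothetical example values, not taken from any real curve:
# hazard intensity (e.g. flood depth in metres) and relative damage (fraction).
x = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
y = np.array([0.0, 0.1, 0.3, 0.6, 0.9])


def impact(intensity, x, y):
    """Linearly interpolate the impact for a given hazard intensity.

    Assumption: outside [x[0], x[-1]] the curve is held flat at the
    end values, which is what np.interp does by default.
    """
    return np.interp(intensity, x, y)


# Intensity 1.5 lies halfway between x=1.0 and x=2.0,
# so the result is halfway between 0.3 and 0.6 (about 0.45).
print(impact(1.5, x, y))
```

Other interpolation or extrapolation rules could equally be chosen; that choice would need to be part of the convention.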

Case 2: mean and standard deviation provided

The hazard intensity values, $x_i$, are given, $i \in [1 \dots n]$.

$f_i(y) = \mathbb{P}(Y=y|x_i)$
$\mu_i = \int f_i(y) y dy$
$\sigma_i^2 = \int f_i(y) y^2 dy - \mu_i^2$

The means are given in $y$ and the standard deviations in $z$.

$x = [x_1, x_2, \dots, x_n ]$
$y = [\mu_1, \mu_2, \dots, \mu_n ]$
$z = [\sigma_1, \sigma_2, \dots, \sigma_n ]$
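
To make the definitions of $\mu_i$ and $\sigma_i$ concrete, here is a minimal sketch that computes both for a single hazard intensity from a discretised impact distribution, replacing the integrals above with sums. The helper name `moments` and the example probabilities are hypothetical.

```python
import numpy as np


def moments(impacts, probs):
    """Mean and standard deviation of a discrete impact distribution.

    Discrete analogue of the integrals above:
    mu = sum(p * y), sigma^2 = sum(p * y^2) - mu^2.
    """
    impacts = np.asarray(impacts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mu = np.sum(probs * impacts)
    var = np.sum(probs * impacts**2) - mu**2
    # Guard against tiny negative values caused by rounding.
    return mu, np.sqrt(max(var, 0.0))


# Hypothetical distribution of relative damage at one intensity x_i.
mu, sigma = moments([0.0, 0.5, 1.0], [0.25, 0.5, 0.25])
print(mu, sigma)  # mean 0.5, standard deviation sqrt(0.125)
```

Repeating this for each $x_i$ would give the entries of the $y$ and $z$ arrays.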

Case 3: discrete piece-wise linear cumulative distribution function (CDF) provided

The hazard intensity values, $x_i$, are given, $i \in [1 \dots n]$.

$F_i(y) = \mathbb{P}(Y \leq y|x_i)$

The CDF, $F_i(y)$, is given at points $y_j$, $j \in [1 \dots m]$, with $F_{ij} = \mathbb{P}(Y \leq y_j|x_i)$.

$x = [x_1, x_2, \dots, x_n ]$
$y = [y_1, y_2, \dots, y_m ]$
$z = [[F_{11}, F_{12}, \dots, F_{1m}], [F_{21}, F_{22}, \dots, F_{2m}], \dots, [F_{n1}, F_{n2}, \dots, F_{nm}]]$
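
One way to sanity-check a Case 3 specification is to recover the mean impact for each hazard intensity from its row of $z$. The sketch below assumes that the CDF is linear between consecutive $y_j$ (so probability is spread uniformly within each segment), that any probability $F_{i1}$ sits as a point mass at $y_1$, and that $F_{im} = 1$. The function name `mean_impact` and all numbers are hypothetical.

```python
import numpy as np


def mean_impact(y, cdf_row):
    """Mean impact implied by a piece-wise linear CDF.

    Assumptions: cdf_row[j] = P(Y <= y[j]); cdf_row[-1] == 1;
    cdf_row[0] is a point mass at y[0]; within each segment the
    probability is uniform, so it contributes at the segment midpoint.
    """
    y = np.asarray(y, dtype=float)
    cdf_row = np.asarray(cdf_row, dtype=float)
    point_mass = cdf_row[0] * y[0]
    seg_prob = np.diff(cdf_row)          # probability in each segment
    seg_mid = 0.5 * (y[:-1] + y[1:])     # mean of a uniform segment
    return point_mass + np.sum(seg_prob * seg_mid)


# Hypothetical example with n = 2 intensities and m = 3 impact points.
x = np.array([0.5, 1.0])
y = np.array([0.0, 0.5, 1.0])
z = np.array([[0.6, 0.9, 1.0],
              [0.2, 0.7, 1.0]])

assert z.shape == (len(x), len(y))  # one CDF row per hazard intensity
means = np.array([mean_impact(y, row) for row in z])
print(means)  # approximately [0.15, 0.35]
```

The same row-wise layout also makes it straightforward to check that each row is non-decreasing and ends at 1.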
