GFDRR / rdl-standard

The Risk Data Library Standard (RDLS) is an open data standard to make it easier to work with disaster and climate risk data. It provides a common description of the data used and produced in risk assessments, including hazard, exposure, vulnerability, and modelled loss, or impact, data.
https://docs.riskdatalibrary.org/
Creative Commons Attribution Share Alike 4.0 International

[DATA] Inclusion of OIA-SEA datasets #33

Open matamadio opened 3 years ago

matamadio commented 3 years ago

Review of SEA DRIF datasets and assessment of procedure for inclusion in RDL

Web tool at https://tool.oi-analytics.com


Metrics presented

Overview

Suggested procedure for RDL ingestion (draft)

Outputs:

Team:

Estimated number of days and budget:

matamadio commented 3 years ago

Example Data package from Gordon


Sample details

Country

Myanmar

Global datasets

  • Fluvial and coastal flooding: WRI Aqueduct
  • Cyclones: STORM IBTrACS model
  • Road links: OpenStreetMap + OSRM
  • Railway tracks: OpenStreetMap
  • Electricity transmission lines: Gridfinder
  • Fragility: Koks et al. 2019; Miyamoto et al. 2019; Habermann and Hedel 2018
  • Socio-economic impacts: Gridded Population Density/Count (WorldPop Population Data); Gridded GDP per capita (DRYAD Gridded GDP)
  • Costs: Koks et al. (2019); World Bank ROCKS database; World Bank PPI database

Data types

Data in 4 folders:

  1. Inputs
    • Adm units
    • OSM rails and roads
    • Tree cover
    • Electric Grid
    • Cyclone hazard
    • River Flood hazard (RP100 2050-RCP45)
    • Coastal Flood hazard (RP100 Historical)
    • Wind speed
    • Pollution (NOX emissions)
  2. Exposure (combination of inputs: hazard and asset at feature level)
    • Electricity
    • Rail
    • Road
  3. Risks (same as exposure, plus cost output - all attributes are maintained)
    • Electricity (cyclone only)
    • Rail (3 hazards)
    • Road (3 hazards)
  4. Summary (aggregation of Risks at ADM1 level)
    • Electricity (cyclone only)
    • Rail (3 hazards)
    • Road (3 hazards)
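
The relationship between folders 3 and 4 can be sketched in a few lines. This is only an illustration: it assumes each Risks feature carries a hypothetical `adm1` key naming its ADM1 unit, and that aggregation is a plain sum of the selected numeric attributes. The aggregation rule actually used in the data package is not stated here, and a sum would not be appropriate for probability attributes. The segment values are made up.

```python
from collections import defaultdict


def summarise_by_adm1(features, columns, adm1_key="adm1"):
    """Sum the selected numeric columns of feature records per ADM1 unit.

    `adm1_key` and the record layout are assumptions for illustration.
    """
    totals = defaultdict(lambda: {c: 0.0 for c in columns})
    for feature in features:
        unit = feature[adm1_key]
        for column in columns:
            # A missing attribute is treated as a zero contribution.
            totals[unit][column] += feature.get(column, 0.0)
    return dict(totals)


# Made-up road segments; attribute names follow the Risks naming convention.
roads = [
    {"adm1": "Yangon", "minEAD_cyclone": 0.10, "maxEAD_cyclone": 0.25},
    {"adm1": "Yangon", "minEAD_cyclone": 0.05, "maxEAD_cyclone": 0.10},
    {"adm1": "Bago", "minEAD_cyclone": 0.20, "maxEAD_cyclone": 0.30},
]
summary = summarise_by_adm1(roads, ["minEAD_cyclone", "maxEAD_cyclone"])
```

With real data the same grouping would more likely be done with a spatial join against the Adm units layer in Inputs.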

The columns listed below (and described above, in Risks) are also aggregated to the Admin1 level for 4. Summary. Depending on the type of data, the outputs may be per RP, epoch and scenario OR Annual (please see Risks, above, for details):


Geodata format review

MapInfo PROs:

MapInfo CONs:


Geodata content

Hazard

The full set of hazards covers:

WRI Aqueduct flood hazard
  • Hazard types: fluvial (river); coastal flooding with subsidence (median value)
  • Probabilities: 1/2, 1/5, 1/10, 1/25, 1/50, 1/100, 1/250, 1/500, and 1/1000
  • Intensities and spatial extents: flood depths in meters over 30 arc second grid squares
  • Climate scenario information: 1 historical and 5 future climate models; RCP 4.5 and 8.5 emission scenarios; current and future maps in 2030, 2050, 2080

Cyclones from STORM IBTrACS model
  • Probabilities: 28 different probabilities from 1/10 to 1/10000
  • Intensities and spatial extents: 3-hour time step wind gust speeds in m/s at 0.1-degree grid squares
  • Climate scenario information: none

Exposure

Generalised asset cost information is applied to estimate exposed value, as described in Table 3-3 of the report. In some instances, the authors assumed a ±25% uncertainty in their cost estimates, in line with the assumptions of Koks et al. (2019).
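
As a minimal sketch of what a ±25% cost uncertainty implies, the snippet below turns a central cost estimate into a low/high range. The function name and the example value are hypothetical, and whether the min/max cost attributes in the Risks data were derived exactly this way is an assumption, not something stated in the report excerpt.

```python
def cost_range(central_cost, relative_uncertainty=0.25):
    """Return (low, high) bounds around a central cost estimate."""
    low = central_cost * (1.0 - relative_uncertainty)
    high = central_cost * (1.0 + relative_uncertainty)
    return low, high


# Hypothetical central estimate of 2.0 ($M USD).
low, high = cost_range(2.0)
```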

Vulnerability/Fragility

Figure 3-1 shows direct damage (fragility) curves for assets from different studies. Since a single fragility curve is not ideal in such a generalised context, Koks et al. (2019) suggested adding uncertainty to the fragility information and used five curves (derived from the original) to test the sensitivity of damage estimates to different fragility values.

Figure 3-1: Generalised direct damage (fragility) curves vs flood depths for different types of infrastructure assets: (a) paved roads (from Koks et al. 2019); (b) unpaved roads (from Koks et al. 2019); (c) railway lines (from Koks et al. 2019); (d) power plants (from Miyamoto et al. 2019); (e) airports (from Habermann and Hedel 2018); (f) ports (based on expert judgment). The fragility curve for airports mainly represents flood damage to runways. The fragility curve for ports is based on expert input from a large port authority, details of which we cannot disclose. The boldest lines (State 1) are used in the original studies, while the other curves (State 2 – State 5) are derived from the original curve by multiplying by 2, 3, 4, 5.
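
The derivation of the State 2 to State 5 curves can be sketched as follows. The curve points below are placeholders and are not read from Figure 3-1. Capping the scaled damage fraction at 1.0 and interpolating linearly between points are both assumptions made for this illustration.

```python
from bisect import bisect_right


def interpolate(curve, depth):
    """Linearly interpolate a damage fraction from (depth_m, fraction) points."""
    depths = [d for d, _ in curve]
    if depth <= depths[0]:
        return curve[0][1]
    if depth >= depths[-1]:
        return curve[-1][1]
    i = bisect_right(depths, depth)
    (d0, f0), (d1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (depth - d0) / (d1 - d0)


def derived_states(curve, multipliers=(1, 2, 3, 4, 5)):
    """Build State 1..5 curves by scaling the original curve.

    Scaled fractions are capped at 1.0 (an assumption, not from the source).
    """
    return {
        f"State {n}": [(d, min(1.0, f * m)) for d, f in curve]
        for n, m in enumerate(multipliers, start=1)
    }


# Placeholder State 1 curve: (flood depth in m, damage fraction).
state1 = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.2), (4.0, 0.3)]
states = derived_states(state1)
damage_state3 = interpolate(states["State 3"], 1.5)
```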

Figure 3-2: Generalised damage probability (fragility) curves vs wind speeds for different types of assets.

Impacts and risk

The main outputs are presented as three metrics:

Important attributes:

  1. The probability of asset failure associated with the given exposure has been calculated and added as a new attribute for each attribute representing hazard exposure. These attributes have the same names as in the exposure data, but with "pFail_" prepended (probabilities of failure are dimensionless):
    • pFail__RP for present-day, for example "pFail_cyclone_RP00090"
    • pFail_RP, for example "pFail_NS_2030_RP0002"
  2. In addition, the expected length of asset damaged by each event has been calculated and is stored in a new attribute with "assetDamage_" prepended (expected lengths of asset damaged are stored in km):
    • assetDamage__RP for present-day, for example "assetDamage_cyclone_RP00090"
    • assetDamage_RP, for example "assetDamage_NS_2030_RP0002"
  3. The direct damage associated with the exposure (the cost of reinstating the predicted length of damaged infrastructure) has also been saved on the asset (stored in $M USD). It can be identified by "minEventCost" or "maxEventCost" prepended to the exposure hazard attribute, representing the minimum and maximum anticipated cost of works:
    • minEventCost__RP for present-day, for example "minEventCost_cyclone_RP00090"
    • maxEventCost__RP for present-day, for example "maxEventCost_cyclone_RP00090"
    • minEventCost_RP, for example "minEventCost_NS_2030_RP0002"
    • maxEventCost_RP, for example "maxEventCost_NS_2030_RP0002"
  4. The indirect cost (value of GDP flowing through the asset) associated with the event damage has also been saved on the asset and can be identified by the prepended term "gdp_" (stored in $M USD / year):
    • gdp__RP for present-day, for example "gdp_cyclone_RP00090"
    • gdp_RP, for example "gdp_NS_2030_RP0002"
  5. By integrating over all RPs in an epoch and scenario, the annual probability of asset failure has been calculated and stored in a new attribute with "annualProbability_" prepended (a dimensionless probability of failure); the RP value has been dropped from the attribute name:
    • annualProbability_ for present-day, for example "annualProbability_cyclone"
    • annualProbability, for example "annualProbability_NS_2030"
  6. Multiplying the annual probability by the length of the asset yields the expected annual length of asset damaged (in km) under a nominated hazard (for a given epoch and scenario), which can be identified by the prefix "expectedLengthDamaged_":
    • expectedLengthDamaged_ for present-day exposures, for example "expectedLengthDamaged_cyclone"
    • expectedLengthDamaged_, for example "expectedLengthDamaged_NS_2030"
  7. The expected annual direct damages (EAD) associated with a combination of hazard, epoch and scenario have been calculated for every asset. These values are identified by the prepended word "minEAD" or "maxEAD", representing the range of expected annual damage associated with the hazard for the nominated epoch and scenario, and are stored in $M USD / year:
    • minEAD_ for present-day exposures, for example "minEAD_cyclone"
    • maxEAD_ for present-day exposures, for example "maxEAD_cyclone"
    • minEAD, for example "minEAD_NS_2030"
    • maxEAD, for example "maxEAD_NS_2030"
  8. The indirect losses associated with the predicted annual damage to every asset have been calculated. These values are identified by the prepended word "EAEL-gdp_", representing the anticipated loss to GDP ($M USD / year) associated with the hazard for the nominated epoch and scenario:
    • EAEL-gdp_ for present-day exposures, for example "EAEL-gdp_cyclone"
    • EAEL-gdp, for example "EAEL-gdp_NS_2030"
  9. Additionally, for road and rail assets it has been possible to estimate the split between primary, secondary and tertiary contributions to GDP, in addition to "gdp" as described in items 4 and 8. Where these data are available, additional attributes have been stored in the risk datasets. They are identified by the prepended words "primary-gdp", "secondary-gdp" and "tertiary-gdp" for individual events, and by "EAEL-primary-gdp", "EAEL-secondary-gdp" and "EAEL-tertiary-gdp" for annual indirect losses. All data are stored in $M USD / year:
    • tertiary-gdp__RP for present-day, for example "tertiary-gdp_cyclone_RP00090"
    • secondary-gdp_RP, for example "secondary-gdp_NS_2030_RP0002"
    • EAEL-primary-gdp_ for present-day exposures, for example "EAEL-primary-gdp_cyclone"
    • EAEL-tertiary-gdp, for example "EAEL-tertiary-gdp_NS_2030"
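
The naming convention above lends itself to a small parser, and the per-RP values can then be integrated into annual figures as described in items 5 and 7. The sketch below is an illustration under stated assumptions. The regular expression is inferred only from the quoted example names. The function names are hypothetical. The integration uses a plain trapezoidal rule over annual exceedance probability (1/RP) with no extrapolation beyond the available return periods, which may differ from the scheme actually used to produce the data. The example values are made up.

```python
import re

# Metric prefix, then a qualifier, then an optional "_RP<digits>" suffix.
# Names without the suffix are annual attributes (RP dropped from the name).
ATTRIBUTE = re.compile(
    r"^(?P<metric>[A-Za-z-]+)_(?P<qualifier>.+?)(?:_RP(?P<rp>\d+))?$"
)


def parse_attribute(name):
    """Split an attribute name into (metric, qualifier, return_period)."""
    match = ATTRIBUTE.match(name)
    if match is None:
        raise ValueError(f"unrecognised attribute name: {name!r}")
    rp = match.group("rp")
    return match.group("metric"), match.group("qualifier"), int(rp) if rp else None


def expected_annual_value(values_by_rp):
    """Trapezoidal integral of a value over exceedance probability 1/RP."""
    points = sorted((1.0 / rp, value) for rp, value in values_by_rp.items())
    total = 0.0
    for (p0, v0), (p1, v1) in zip(points, points[1:]):
        total += 0.5 * (v0 + v1) * (p1 - p0)
    return total


def annualise(attributes):
    """Group per-RP attributes by (metric, qualifier) and integrate each group."""
    grouped = {}
    for name, value in attributes.items():
        metric, qualifier, rp = parse_attribute(name)
        if rp is not None:  # skip attributes that are already annual
            grouped.setdefault((metric, qualifier), {})[rp] = value
    return {key: expected_annual_value(series) for key, series in grouped.items()}


# Made-up event costs ($M USD) for one asset.
asset = {
    "minEventCost_cyclone_RP00010": 1.0,
    "minEventCost_cyclone_RP00100": 3.0,
    "minEAD_cyclone": 0.0,
}
annual = annualise(asset)
```

Keeping the qualifier as an opaque string avoids guessing how hazard, epoch and scenario are encoded in names such as "NS_2030".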



Actions to host datasets into RDL

Align metadata to schema

Choose storage/distribution format

There are a number of alternative options that trade off a) curation effort, b) friction for the user, c) compatibility with the online tool and d) storage size. A lot depends on the requirements of the analytics webtool, and on whether it will be discontinued. Please also note that the webtool does not use or show all the attributes in the datasets, only a few.

Option   | Curation effort | Friction for user | Compatibility with webtool | Storage size
Option 1 | Low             | Worst             | Best                       | Bad
Option 2 | High            | Best              | Best                       | Worst
Option 3 | High + Dev      | Best              | Best                       | Best

Option 1

Option 2

Option 3


Value for RDL

Time and feasibility

This depends on the chosen curation/storage option. The metadata alignment step requires the same time for all three options, estimated at 4-5 full days. The data curation effort varies a lot.

matamadio commented 3 years ago

So the final size of the optimised data could be around 1/5 of the original data, which is relevant for Options 2 and 3.

matamadio commented 3 years ago

As Pierre confirmed that the online tool will be discontinued, we will focus our effort on using the data for RDL services, without needing to link them to the current web application. The option is then simply:

This should be done by mid June, but depends on how fast they provide the rest of the data, and how fast we solve any arising conversion issues.

matamadio commented 3 years ago

Following up on the conversation with @stufraser1 and @jeanpommier: