gem / oq-engine

OpenQuake's Engine for Seismic Hazard and Risk Analysis
https://github.com/gem/oq-engine/#openquake-engine
GNU Affero General Public License v3.0

Aristotle Project #9227

Open micheles opened 7 months ago

micheles commented 7 months ago

Given a USGS ShakeMap ID, get the corresponding rupture, exposure, vulnerability functions and GMPEs and perform a risk calculation (see https://docs.google.com/document/d/1mS2S7yOohiJjiEqL85E65k2XbOj7xEAuUo4NaAFHuco).

Geolocation by country can be done via the files in https://www.geoboundaries.org/globalDownloads.html
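
A minimal sketch of the country lookup, assuming the boundaries have been loaded as a GeoJSON FeatureCollection with one feature per country. The property name `shapeGroup`, the helper names and the toy square with the made-up code "AAA" are assumptions for illustration only; with the real files a spatial library with an index would probably be a better choice than this plain ray-casting test.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of [lon, lat] pairs?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Consider only edges that straddle the latitude of the point
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Toggle each time a ray going east from the point crosses an edge
            if lon < x_cross:
                inside = not inside
    return inside


def point_in_polygon(lon, lat, rings):
    """The first ring is the outer boundary, any further rings are holes."""
    if not point_in_ring(lon, lat, rings[0]):
        return False
    return not any(point_in_ring(lon, lat, hole) for hole in rings[1:])


def find_country(lon, lat, feature_collection, code_key="shapeGroup"):
    """Return the code of the first feature containing the point, else None."""
    for feature in feature_collection["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue
        if any(point_in_polygon(lon, lat, rings) for rings in polygons):
            return feature["properties"][code_key]
    return None


# Toy boundary (not a real country): a square around a test point
toy = {"type": "FeatureCollection", "features": [{
    "type": "Feature",
    "properties": {"shapeGroup": "AAA"},
    "geometry": {"type": "Polygon", "coordinates": [
        [[70.0, 30.0], [80.0, 30.0], [80.0, 40.0], [70.0, 40.0], [70.0, 30.0]]]},
}]}
country = find_country(74.6280, 35.5909, toy)
outside = find_country(0.0, 0.0, toy)
```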

The difficulty here is to collect the world exposure, world vulnerability functions/taxonomy mappings and world GMPEs from dozens of repositories and fix all the inconsistencies. Here is a list of inconsistencies:

NB: the risk regions are

Africa
Caribbean_Central_America
Central_Asia
East_Asia
Europe
Middle_East
North_America
North_Asia
Oceania
South_America
South_Asia
Southeast_Asia

USGS ruptures (like https://earthquake.usgs.gov/product/shakemap/us70006sj8/atlas/1594403794805/download/rupture.json) have the format

{
  "type": "FeatureCollection",
  "metadata": {
    "reference": "Origin",
    "id": "us70006sj8",
    "network": "USGS National Earthquake Information Center, PDE",
    "netid": "us",
    "productcode": "us70006sj8",
    "time": "2019-12-30T17:18:57.000000Z",
    "lat": 35.5909,
    "lon": 74.6280,
    "depth": 13.8,
    "mag": 5.6,
    "locstring": "34km NW of Idgah, Pakistan",
    "mech": "ALL",
    "rake": 0
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "rupture type": "rupture extent"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [ 74.6280, 35.5909, 13.8 ]
      }
    }
  ]
}
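
As a rough sketch, the metadata of such a file could be read into a small record like this. The field names come from the JSON above; the `RuptureInfo` class and `parse_usgs_rupture` function are hypothetical names, not part of the engine, and the sample is a trimmed copy of the file shown above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuptureInfo:
    event_id: str
    lon: float
    lat: float
    depth: float
    mag: float
    rake: Optional[float]
    is_point: bool


def parse_usgs_rupture(text):
    """Extract the hypocenter, magnitude and rake from a USGS rupture.json."""
    doc = json.loads(text)
    meta = doc["metadata"]
    rake = meta.get("rake")  # kept as None if the file does not give it
    # A rupture made only of Point features carries no finite-fault geometry
    geom_types = {feat["geometry"]["type"] for feat in doc["features"]}
    return RuptureInfo(
        event_id=meta["id"],
        lon=float(meta["lon"]),
        lat=float(meta["lat"]),
        depth=float(meta["depth"]),
        mag=float(meta["mag"]),
        rake=None if rake is None else float(rake),
        is_point=geom_types == {"Point"},
    )


sample = """
{
  "type": "FeatureCollection",
  "metadata": {"id": "us70006sj8", "lat": 35.5909, "lon": 74.6280,
               "depth": 13.8, "mag": 5.6, "mech": "ALL", "rake": 0},
  "features": [
    {"type": "Feature",
     "properties": {"rupture type": "rupture extent"},
     "geometry": {"type": "Point", "coordinates": [74.6280, 35.5909, 13.8]}}
  ]
}
"""
info = parse_usgs_rupture(sample)
```

The `is_point` flag is only meant to distinguish this case from the files with real geometries mentioned below, which need a proper conversion.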

Sometimes the USGS also gives .json files with geometries that can be converted to OpenQuake ruptures as in this notebook: https://github.com/gem/earthquake-scenarios/blob/main/src/2_1_rupture_usgs_json_to_oq_xml.ipynb

We also need

See also https://gempad.openquake.org/p/2023-12-21-Aristotle-xkddghsdg9876hf.

The taxonomy mapping per country can be extracted from here: https://gitlab.openquake.org/risk/global_risk_model/Scripts/-/blob/master/grm_calculations/job_files.csv

NB: the taxonomy mapping is a HUGE problem. Currently the engine cannot manage the case of two assets of the same taxonomy being mapped to different vulnerability functions because they belong to different countries. The taxonomy mapping is global, while we would need to make it country-dependent. Also, splitting the exposure by country and performing multiple calculations is a solution only in theory, since it makes everything more complex and much slower. We will probably have to completely rewrite the risk calculators (for instance the RiskComputer assumes assets with the same taxonomy are associated with the same risk functions), which is hard :-(
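
To make the problem concrete, here is a toy illustration of a global mapping versus a country-dependent one. Everything in it is hypothetical (taxonomy strings, function ids, the fallback on the global entry) and it says nothing about how the RiskComputer is actually implemented; real taxonomy mappings also carry weights, which are left out here.

```python
# Global mapping: one vulnerability function per taxonomy, whatever the country
global_mapping = {"CR/LFINF": "vf_cr_generic"}

# Country-dependent mapping: the key is (country, taxonomy)
country_mapping = {
    ("ITA", "CR/LFINF"): "vf_cr_ita",
    ("TUR", "CR/LFINF"): "vf_cr_tur",
}


def risk_function_id(country, taxonomy):
    """Prefer the country-specific entry, fall back on the global one."""
    try:
        return country_mapping[country, taxonomy]
    except KeyError:
        return global_mapping[taxonomy]


assets = [
    {"id": "a1", "country": "ITA", "taxonomy": "CR/LFINF"},
    {"id": "a2", "country": "TUR", "taxonomy": "CR/LFINF"},
    {"id": "a3", "country": "GRC", "taxonomy": "CR/LFINF"},
]

# Same taxonomy, but the assets end up in three different groups
by_function = {}
for asset in assets:
    func_id = risk_function_id(asset["country"], asset["taxonomy"])
    by_function.setdefault(func_id, []).append(asset["id"])
```

Grouping by taxonomy alone would put the three assets together, which is exactly the assumption that breaks.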

nicolepaul commented 7 months ago

Michele, please see the attached CSV to help you map between the GRM repos and the hazard mosaic repos. I also included some comments on 'exceptions' to the general case.

If you clone the relevant risk region repo (e.g., global_riskmodel/Africa) with --recurse-submodules, or update the submodules after cloning, you should have all the dependencies (hazard, exposure, vulnerability) at the appropriate versions. The job.ini file lists all the specific paths you need for the gmmLT, vulnerability curves, etc. The Exposure.xml file indicates whether to use the aggregated or the disaggregated exposure.

The current status of the risk repos on cole/davis is unknown: we have not run the GRM since June, individual modellers may have changed those files, and some repos may be only partially cloned (without submodules) after some server/cluster modifications.

GRM_Mosaic_Map.csv

micheles commented 6 months ago

Currently the idea is to build a few HDF5 files at each new release of the mosaic:

Then the Aristotle calculator will be able to extract the relevant information from such files quickly.

raoanirudh commented 1 month ago

A further crucial feature will be the inclusion of recording station data for ground motion conditioning, if such station data is already available at the time of launching an Aristotle calculation.

Given a USGS ShakeMap id, the station data curated by the USGS for the event can be found in the associated stationlist.json file, for instance https://earthquake.usgs.gov/product/shakemap/us7000m9g4/us/1715297585708/download/stationlist.json for the 2024 M7.4 Hualien earthquake earlier this year in Taiwan. INGV uses an identical format for their station data file, for instance http://shakemap.rm.ingv.it/shake4/data/8863681/current/products/stationlist.json for the 2016 M6.5 Norcia earthquake. Documentation of the stationlist.json file format is available at https://usgs.github.io/shakemap/manual4_0/ug_products.html#stationlist-geojson.

The JSON file would need to be parsed, checked for duplicate entries and outliers, and converted to the CSV format accepted by the OpenQuake engine (or directly to the internal dataframe format used by the engine after reading the CSV station data input file).

The station data file can contain two kinds of stations – 'seismic' stations and 'macroseismic' stations. Seismic stations report the ground motions recorded by instruments, whereas macroseismic stations might report intensity values inferred from observed damage patterns for historical earthquakes or inferred from felt reports for recent earthquakes. For this implementation, only the seismic stations should be considered. All available IMTs relevant for the risk calculations should be read from the station data file – typically available IMTs might include PGA, SA(0.3), SA(1.0), and SA(3.0).
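
A rough sketch of the parsing step, under several assumptions that should be checked against the stationlist.json documentation linked above: that each feature has an `id`, a `station_type` property and a list of `channels` with `amplitudes`, and that the IMT names appear as "pga", "sa(0.3)" and so on. Taking the largest value over the channels is also an assumption. The output column names are placeholders rather than the ones the engine expects, and no unit conversion or outlier check is done here.

```python
import csv
import io
import json

IMTS = ("pga", "sa(0.3)", "sa(1.0)", "sa(3.0)")


def seismic_station_rows(stationlist_text, imts=IMTS):
    """Return one row per unique seismic station, keeping the first occurrence."""
    doc = json.loads(stationlist_text)
    rows, seen = [], set()
    for feat in doc["features"]:
        props = feat["properties"]
        if props.get("station_type") != "seismic":
            continue  # macroseismic stations are not considered
        if feat["id"] in seen:
            continue  # duplicate entry
        seen.add(feat["id"])
        lon, lat = feat["geometry"]["coordinates"][:2]
        row = {"station_id": feat["id"], "lon": lon, "lat": lat}
        for chan in props.get("channels", []):
            for amp in chan.get("amplitudes", []):
                name = amp["name"].lower()
                value = amp.get("value")
                # Skip missing or non-numeric values, keep the largest one
                if name in imts and isinstance(value, (int, float)):
                    row[name] = max(row.get(name, value), value)
        rows.append(row)
    return rows


def to_csv(rows, imts=IMTS):
    """Serialize the rows, leaving empty cells for the IMTs that are missing."""
    buf = io.StringIO()
    fields = ["station_id", "lon", "lat", *imts]
    writer = csv.DictWriter(buf, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up station list: one seismic station listed twice, one macroseismic
sta = {"id": "NET.STA1",
       "geometry": {"type": "Point", "coordinates": [121.6, 23.8]},
       "properties": {"station_type": "seismic", "channels": [
           {"name": "HNE", "amplitudes": [{"name": "pga", "value": 12.5}]},
           {"name": "HNN", "amplitudes": [{"name": "pga", "value": 9.0}]}]}}
felt = {"id": "DYFI.1",
        "geometry": {"type": "Point", "coordinates": [121.5, 23.9]},
        "properties": {"station_type": "macroseismic", "channels": []}}
sample = json.dumps({"type": "FeatureCollection", "features": [sta, sta, felt]})

rows = seismic_station_rows(sample)
csv_text = to_csv(rows)
```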

A site model would also need to be generated for the station sites. If the Vs30 values at the locations of the stations are already available through the stationlist.json file, those can be used directly; otherwise the Vs30 values for the stations would need to be extracted from the global vs30 hdf5 file. If any site parameters other than Vs30 are required by the ground motion models that will be used in the calculation, those will also need to be included in the station site model file.
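
The Vs30 fallback could look roughly like this. The sketch assumes the stations are plain dicts with `lon`, `lat` and an optional `vs30` key, and the `lookup_vs30` callable is a hypothetical stand-in for whatever reads the global vs30 hdf5 file; treating non-positive values as missing is also an assumption.

```python
def station_site_model(stations, lookup_vs30):
    """Build one site per station, taking Vs30 from the station if present.

    stations: dicts with lon, lat and optionally vs30
    lookup_vs30: callable (lon, lat) -> vs30 used when the station has none
    """
    sites = []
    for sta in stations:
        vs30 = sta.get("vs30")
        source = "stationlist"
        if vs30 is None or vs30 <= 0:
            # Fall back on the global Vs30 model
            vs30 = lookup_vs30(sta["lon"], sta["lat"])
            source = "global"
        sites.append({"lon": sta["lon"], "lat": sta["lat"],
                      "vs30": float(vs30), "vs30_source": source})
    return sites


# Made-up stations and a constant stand-in for the global lookup
stations = [
    {"lon": 121.6, "lat": 23.8, "vs30": 480.0},
    {"lon": 121.7, "lat": 24.0},
]
sites = station_site_model(stations, lambda lon, lat: 760.0)
```

Keeping track of where each Vs30 came from makes it easier to check the generated site model afterwards.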

Once we have these two new inputs (the station data file or dataframe, and the station site model), Aristotle should run the requested scenario with the conditioned_gmfs calculator.