Open micheles opened 1 year ago
Michele, please see the attached CSV to help you map between the GRM repos and the hazard mosaic repos. I also included some comments on 'exceptions' to the general case.
If you clone the relevant risk region repo (e.g., global_riskmodel/Africa) with --recurse-submodules (or update the submodules afterwards), you should have all the dependencies (hazard, exposure, vulnerability) on the appropriate versions. The job.ini file will list all the specific paths you need for the gmmLT, vulnerability curves, etc. The Exposure
The current status of the risk repos on cole/davis is unknown: we have not run the GRM since June, individual modellers may have made changes to those files, some repos may be only partially cloned (without submodules) after some server/cluster modifications, etc.
Currently the idea is to build a few HDF5 files at each new release of the mosaic:
utils/build_global_sites
utils/build_global_exposure
Then the Aristotle calculator will be able to extract the relevant information from such files quickly.
A further crucial feature will be the inclusion of recording station data for ground motion conditioning, if such station data is already available at the time of launching an Aristotle calculation.
Given a USGS ShakeMap id, the station data curated by the USGS for the event can be found in the associated stationlist.json file, for instance https://earthquake.usgs.gov/product/shakemap/us7000m9g4/us/1715297585708/download/stationlist.json for the 2024 M7.4 Hualien earthquake earlier this year in Taiwan. INGV uses an identical format for their station data file, for instance http://shakemap.rm.ingv.it/shake4/data/8863681/current/products/stationlist.json for the 2016 M6.5 Norcia earthquake. Documentation of the stationlist.json file format is available at https://usgs.github.io/shakemap/manual4_0/ug_products.html#stationlist-geojson.
The json file would need to be parsed, checked for duplicate entries and outliers, and converted to the csv format accepted by the OpenQuake engine (or directly to the internal dataframe format used by the engine after reading the csv station data input file).
The station data file can contain two kinds of stations – 'seismic' stations and 'macroseismic' stations. Seismic stations report the ground motions recorded by instruments, whereas macroseismic stations might report intensity values inferred from observed damage patterns for historical earthquakes or inferred from felt reports for recent earthquakes. For this implementation, only the seismic stations should be considered. All available IMTs relevant for the risk calculations should be read from the station data file – typically available IMTs might include PGA, SA(0.3), SA(1.0), and SA(3.0).
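The parsing step described above could be sketched roughly as follows. This is a minimal sketch, not the code from the notebook or the engine: it assumes that each GeoJSON feature carries `properties.station_type`, `properties.channels[*].amplitudes[*]` with `name` and `value` keys, that amplitudes are expressed in %g, and that the engine's CSV uses `<IMT>_VALUE` column names. The function names are hypothetical, and uncertainty columns and outlier screening are left out.

```python
import csv
import io
import math

# Assumed amplitude names in stationlist.json, mapped to engine IMT strings.
IMT_MAP = {"pga": "PGA", "sa(0.3)": "SA(0.3)",
           "sa(1.0)": "SA(1.0)", "sa(3.0)": "SA(3.0)"}


def _to_float(value):
    """Convert an amplitude to float; return None for null/invalid entries."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def seismic_station_rows(stationlist, imt_map=IMT_MAP):
    """Return one row per unique seismic station (hypothetical helper)."""
    rows = {}
    for feature in stationlist["features"]:
        props = feature["properties"]
        if props.get("station_type") != "seismic":
            continue  # skip macroseismic stations
        station_id = str(feature["id"])
        if station_id in rows:
            continue  # duplicate entry: keep the first occurrence
        lon, lat = feature["geometry"]["coordinates"][:2]
        row = {"STATION_ID": station_id,
               "STATION_NAME": props.get("name", ""),
               "LONGITUDE": lon, "LATITUDE": lat,
               "STATION_TYPE": "seismic"}
        for channel in props.get("channels", []):
            for amp in channel.get("amplitudes", []):
                imt = imt_map.get(str(amp.get("name", "")).lower())
                value = _to_float(amp.get("value"))
                if imt is None or value is None:
                    continue
                if not math.isfinite(value) or value <= 0:
                    continue  # discard missing or non-physical readings
                key = imt + "_VALUE"
                # Assumption: values are in %g and the engine expects g.
                # Keep the largest amplitude over all channels.
                row[key] = max(row.get(key, 0.0), value / 100.0)
        rows[station_id] = row
    return list(rows.values())


def rows_to_csv(rows, imts=("PGA", "SA(0.3)", "SA(1.0)", "SA(3.0)")):
    """Serialize the rows to CSV text; missing IMTs are left empty."""
    fields = ["STATION_ID", "STATION_NAME", "LONGITUDE", "LATITUDE",
              "STATION_TYPE"] + [imt + "_VALUE" for imt in imts]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Taking the maximum over channels is a simplification; a real implementation would probably restrict it to the horizontal components and honour the quality flags in the file.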
A site model would also need to be generated for the station sites. If the Vs30 values at the locations of the stations are already available through the stationlist.json file, those can be used directly; otherwise the Vs30 values for the stations would need to be extracted from the global Vs30 HDF5 file. If the ground motion models used in the calculation require site parameters other than Vs30, those additional site parameters will also need to be included in the station site model file.
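The Vs30 fallback logic could look like the sketch below. It assumes the station records have already been reduced to dicts with `lon`, `lat` and an optional `vs30`, and it takes the lookup into the global Vs30 file as a caller-supplied function, since the layout of that HDF5 file is not described here. Both function names are hypothetical, and only the `lon`, `lat`, `vs30` columns are written.

```python
import csv
import io


def build_station_site_model(stations, vs30_lookup):
    """Return site-model rows (lon, lat, vs30) for the given stations.

    stations: iterable of dicts with 'lon', 'lat' and an optional 'vs30'
    vs30_lookup: callable (lon, lat) -> Vs30 in m/s, used as a fallback
    """
    sites = []
    for station in stations:
        try:
            vs30 = float(station.get("vs30"))
        except (TypeError, ValueError):
            vs30 = None  # missing or non-numeric value in the station file
        if vs30 is None or not vs30 > 0:
            # Fall back to the external source (e.g. the global Vs30 file)
            vs30 = float(vs30_lookup(station["lon"], station["lat"]))
        sites.append({"lon": station["lon"], "lat": station["lat"],
                      "vs30": vs30})
    return sites


def site_model_csv(sites):
    """Serialize the site rows to CSV text with a lon,lat,vs30 header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["lon", "lat", "vs30"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(sites)
    return buffer.getvalue()
```

Passing the lookup as a callable keeps the sketch independent of how the global Vs30 data is stored; any extra site parameters required by the GMPEs would need additional columns.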
Once we have these two new inputs (the station data file or dataframe, and the station site model), Aristotle should run the requested scenario with the conditioned_gmfs calculator.
Points from 2024-07-24 meeting:
[KothaEtAl2020ESHM20SlopeGeology]
sigma_mu_epsilon = -2.85697000
c3_epsilon = -1.73205100
We need to expose the time_event parameter through the webui (converting UTC to the local time), with the possibility to override it.
The USGS rupture.json file contains the UTC timestamp: https://earthquake.usgs.gov/realtime/product/shakemap/us6000n8tq/us/1721368807663/download/rupture.json, but the finite fault file may or may not contain the time: https://earthquake.usgs.gov/realtime/product/finite-fault/us6000n8tq_1/us/1719609323516/shakemap_polygon.txt. Python's time module can convert from UTC to local time: https://docs.python.org/3.11/library/time.html#time.localtime
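One caveat is that time.localtime converts to the timezone of the machine running the code, not to the timezone at the epicentre. The sketch below therefore takes the event's timezone explicitly. The function names and the hour thresholds used to pick day/transit/night are assumptions for illustration only, and the timestamp in the usage lines is just an example.

```python
from datetime import datetime, timedelta, timezone


def to_local_time(utc_iso, tz):
    """Convert an ISO 8601 UTC timestamp to the given timezone."""
    utc = datetime.fromisoformat(utc_iso.replace("Z", "+00:00"))
    if utc.tzinfo is None:
        utc = utc.replace(tzinfo=timezone.utc)  # treat naive input as UTC
    return utc.astimezone(tz)


def guess_time_event(local_dt):
    """Pick a time_event value from the local hour (assumed thresholds)."""
    hour = local_dt.hour
    if 9 <= hour < 17:
        return "day"
    if 5 <= hour < 9 or 17 <= hour < 21:
        return "transit"
    return "night"


# Example: an event at 23:58:11 UTC in a UTC+8 region
local = to_local_time("2024-04-02T23:58:11Z", timezone(timedelta(hours=8)))
print(local.isoformat(), guess_time_event(local))
```

A fixed UTC offset is used here to keep the example dependency-free; zoneinfo.ZoneInfo with an IANA name would also handle daylight saving time, provided the timezone of the epicentre can be determined.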
Here we can find some code to convert the USGS stationlist.json to a csv file in a format compatible with OQ: https://github.com/gem/earthquake-scenarios/blob/main/src/1_1_stations_usgs_json_to_csv.ipynb
Testing the service on recent earthquakes, we noticed that in most cases the USGS provides very limited information right after the event and for the next few days, so in most cases we can't rely on the ShakeMap or finite-fault products for quick responses. We need to collect some statistics about the USGS policies for making data available after events, and figure out proper strategies to run calculations with different sets of data at different time deltas after an event.
Given a USGS ShakeMap ID, get the corresponding rupture, exposure, vulnerability functions and GMPEs and perform a risk calculation (see https://docs.google.com/document/d/1mS2S7yOohiJjiEqL85E65k2XbOj7xEAuUo4NaAFHuco).
Geolocation by country can be done via the files in https://www.geoboundaries.org/globalDownloads.html
The difficulty here is to collect the world exposure, world vulnerability functions/taxonomy mappings and world GMPEs from dozens of repositories and fix all the inconsistencies. Here is a list of inconsistencies:
/home/risk/global_risk_model/North_America contains an empty directory Exposure/Exposure/Disaggregated/, unlike the other regions.
<field oq="residents" input="OCCUPANTS_PER_ASSET" /> is declared; however, in the CSV files the name is OCCUPANTS_PER_ASSET_AVERAGE.
NB: the risk regions are
USGS ruptures (like https://earthquake.usgs.gov/product/shakemap/us70006sj8/atlas/1594403794805/download/rupture.json) have the format
rupture_dict
rupture_dict
parameter to the job.ini
rupture_dict
by using the code in IPT
Sometimes the USGS also gives .json files with geometries that can be converted to OpenQuake ruptures, as in this notebook: https://github.com/gem/earthquake-scenarios/blob/main/src/2_1_rupture_usgs_json_to_oq_xml.ipynb
We also need
See also https://gempad.openquake.org/p/2023-12-21-Aristotle-xkddghsdg9876hf.
The taxonomy mapping per country can be extracted from here: https://gitlab.openquake.org/risk/global_risk_model/Scripts/-/blob/master/grm_calculations/job_files.csv
NB: the taxonomy mapping is a HUGE problem. Currently the engine cannot manage the case of two assets of the same taxonomy being mapped to different vulnerability functions because they belong to different countries. The taxonomy mapping is global, while we would need to make it country-dependent. Also, splitting the exposure by country and performing multiple calculations is a solution only in theory, since it makes everything more complex and much slower. We will probably have to completely rewrite the risk calculators (for instance, the RiskComputer assumes that assets with the same taxonomy are associated with the same risk functions), which is hard :-(
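To make the problem concrete, here is a toy illustration of what a country-dependent mapping would mean at the data-structure level. It is not how the engine or the RiskComputer work internally; the taxonomy string, country codes and function identifiers are placeholders invented for the example.

```python
from collections import defaultdict

# Hypothetical identifiers: a mapping keyed by taxonomy alone could hold
# only one entry for "MUR/H:2"; keying by (country, taxonomy) allows two.
COUNTRY_MAPPING = {
    ("ITA", "MUR/H:2"): "vf_mur_h2_ita",
    ("GRC", "MUR/H:2"): "vf_mur_h2_grc",
}


def risk_function_for(asset, mapping=COUNTRY_MAPPING):
    """Look up the risk function by country and taxonomy together."""
    return mapping[asset["country"], asset["taxonomy"]]


def group_assets(assets, mapping=COUNTRY_MAPPING):
    """Group asset ids by the risk function they are mapped to."""
    groups = defaultdict(list)
    for asset in assets:
        groups[risk_function_for(asset, mapping)].append(asset["id"])
    return dict(groups)


assets = [
    {"id": "a1", "country": "ITA", "taxonomy": "MUR/H:2"},
    {"id": "a2", "country": "GRC", "taxonomy": "MUR/H:2"},
]
print(group_assets(assets))
```

The lookup itself is trivial; the hard part described above is that the calculators group assets by taxonomy alone, so every place that relies on that grouping would need to switch to the composite key.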