gem / oq-engine

OpenQuake Engine: a software for Seismic Hazard and Risk Analysis
https://github.com/gem/oq-engine/#openquake-engine
GNU Affero General Public License v3.0

Improve error messages for scenario calculations with conditioning #8930

Closed CatalinaYepes closed 1 year ago

CatalinaYepes commented 1 year ago

Two error messages to improve:

1. Missing vs30 values at the location of the observations

When vs30 values are missing at the location of the observations, the following error appears (see also the question in the forum):

--> 494         readinput.get_station_data(oq, self.sitecol)

...

File ~/oq-engine/openquake/commonlib/readinput.py:964, in (.0)
    961 lats = numpy.round(df['LATITUDE'].to_numpy(), 5)
    962 sid = {(lon, lat): sid
    963        for lon, lat, sid in sitecol[['lon', 'lat', 'sids']]}
--> 964 sids = U32([sid[lon, lat] for lon, lat in zip(lons, lats)])
    966 # Identify the columns with IM values
    967 # Replace replace() with removesuffix() for pandas ≥ 1.4
    968 imt_candidates = df.filter(regex="_VALUE$").columns.str.replace(
    969     "_VALUE", "")

KeyError: (85.27259, 27.68216)

Possible error message:

Vs30 values are not available at the station locations. You can provide two site models, e.g.:
site_model_file = site_model.csv site_model_stations.csv
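A minimal sketch of how such a message could be produced instead of the bare KeyError. It is not the engine's code: the function name `station_site_ids`, the plain-tuple inputs, and the use of `ValueError` are assumptions made for illustration; only the 5-digit rounding and the coordinate-to-site-ID lookup mirror the traceback above.

```python
def station_site_ids(site_coords, station_coords, ndigits=5):
    """Map each station (lon, lat) to a site ID, failing with a clear message.

    site_coords: iterable of (lon, lat, sid) tuples (hypothetical layout)
    station_coords: iterable of (lon, lat) tuples
    """
    # Same idea as the dictionary in the traceback: coordinates -> site ID
    sid_by_coord = {(round(lon, ndigits), round(lat, ndigits)): sid
                    for lon, lat, sid in site_coords}
    keys = [(round(lon, ndigits), round(lat, ndigits))
            for lon, lat in station_coords]
    # Collect the stations that have no matching site before indexing
    missing = [key for key in keys if key not in sid_by_coord]
    if missing:
        raise ValueError(
            "Vs30 values are not available at %d station location(s), "
            "e.g. %s. You can provide two site models, e.g.: "
            "site_model_file = site_model.csv site_model_stations.csv"
            % (len(missing), missing[0]))
    return [sid_by_coord[key] for key in keys]
```

With this check, a station at (85.27259, 27.68216) that is absent from the sites would be named in the message rather than surfacing as a KeyError.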

2. Input files with different columns

We need to raise an error when files don't have the same fields. In my case, the following error appears when the job file specifies this site model: site_model_file = Site_model.csv site_model_stations.csv

The field custom_site_id was present only in the site_model_stations.csv file. This error should also be raised for other types of files, like exposure models.

File ~/oq-engine/openquake/commonlib/readinput.py:530, in get_site_model(oqparam)
    527         raise InvalidFile('There are duplicated sites in %s:\n%s' %
    528                           (fname, dupl))
    529     arrays.append(sm)
--> 530 return numpy.concatenate(arrays)

File <__array_function__ internals>:180, in concatenate(*args, **kwargs)

File ~/openquake/lib/python3.9/site-packages/numpy/core/_internal.py:458, in _promote_fields(dt1, dt2)
    456 # Both must be structured and have the same names in the same order
    457 if (dt1.names is None or dt2.names is None) or dt1.names != dt2.names:
--> 458     raise TypeError("invalid type promotion")
    460 # if both are identical, we can (maybe!) just return the same dtype.
    461 identical = dt1 is dt2

TypeError: invalid type promotion
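One way the requested check could look, sketched under assumptions: the helper `check_same_fields` is hypothetical and works on CSV headers only, not on the structured arrays that the engine concatenates. It compares the header of each file with the first one and names the fields that differ.

```python
import csv
import io


def check_same_fields(named_files):
    """Raise ValueError if the CSV headers do not contain the same fields.

    named_files: list of (name, file-like object) pairs (hypothetical input)
    Returns the header of the first file when all headers agree.
    """
    headers = {}
    for name, fobj in named_files:
        headers[name] = next(csv.reader(fobj))  # first row is the header
    (first_name, first), *others = headers.items()
    for name, fields in others:
        # Fields present in one file but not in the other
        diff = sorted(set(first) ^ set(fields))
        if diff:
            raise ValueError(
                "%s and %s have different fields: %s"
                % (first_name, name, ", ".join(diff)))
    return first


# Usage with in-memory files standing in for the two site models
a = io.StringIO("lon,lat,vs30\n85.3,27.7,760\n")
b = io.StringIO("lon,lat,vs30,custom_site_id\n85.4,27.8,760,st1\n")
try:
    check_same_fields([("Site_model.csv", a), ("site_model_stations.csv", b)])
except ValueError as exc:
    message = str(exc)
```

The sketch compares field sets and ignores column order, whereas the NumPy check in the traceback also requires the names to be in the same order, so a real fix would have to handle ordering as well.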
CatalinaYepes commented 1 year ago

The error described in item 1 is actually a bug: even after providing vs30 values for the observations, the same error message appears.

While debugging, I found that the error is only triggered when running scenario damage or risk calculations. The same input files can be used to compute GMFs without triggering the error.

CatalinaYepes commented 1 year ago

During the association of the list of assets by site to the site collection (done with the function assoc2), some stations are currently filtered out.

For the association, we need to consider the asset locations plus the ones in the station_data_file (within the maximum_distance).

micheles commented 1 year ago

This is not a case of a wrong error message. The problem is much more serious: the site collection does not contain the stations. So we need to extend the site collection with the station sites (see https://github.com/gem/oq-engine/pull/8949). I also believe the maximum_distance is a red herring, i.e. it is okay as it is.
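The idea of extending the site collection can be illustrated with a small sketch. This is not the code from the linked pull request: `extend_sites` is a hypothetical helper operating on plain (lon, lat) tuples, and the 5-digit rounding is borrowed from the traceback in the first comment.

```python
def extend_sites(asset_sites, station_sites, ndigits=5):
    """Return the asset sites plus any station sites not already present.

    Both arguments are iterables of (lon, lat) tuples (hypothetical layout).
    Coordinates are rounded so that nearly identical points count as one site.
    """
    seen = set()
    merged = []
    for lon, lat in list(asset_sites) + list(station_sites):
        key = (round(lon, ndigits), round(lat, ndigits))
        if key not in seen:  # keep the first occurrence, preserve order
            seen.add(key)
            merged.append(key)
    return merged


# The station from the KeyError above is appended to the asset sites
sites = extend_sites([(85.3, 27.7)], [(85.27259, 27.68216), (85.3, 27.7)])
```

Once the stations are part of the merged list, a lookup by station coordinates finds a site for each of them.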