gem / oq-engine

OpenQuake Engine: a software for Seismic Hazard and Risk Analysis
https://github.com/gem/oq-engine/#openquake-engine
GNU Affero General Public License v3.0
387 stars 277 forks source link

Please describe the "correct" way to call the GMPEs from now on (add doc page if necessary) #7167

Closed g-weatherill closed 3 years ago

g-weatherill commented 3 years ago

In light of the changes to the GMPE API in the hazardlib (https://github.com/gem/oq-engine/blob/master/doc/breaking-hazardlib.md) the previous way of calling GMPEs directly from the class (i.e. using the get_mean_and_stddevs(...) function with the SitesContext, RuptureContext and DistancesContext objects as arguments) is largely deprecated. In my experience, this workflow is actually now broken.

To my knowledge there are several active projects that are using the GMPEs directly (in the previous manner), along with many individual users who have developed their own code over the years. Most notable among those affected have been the USGS Shakemap project (see here: https://github.com/gem/oq-engine/issues/7018), and there are several other related shakemap projects around the globe (some of whom have developed their own code rather than the USGS Shakemap library).

At present, the strong motion toolkit is also incompatible with the new library, and will require efforts to migrate it. Other scientific users have indicated (to me at least) that they use the GMPE library for comparing against ground motion observations (from databases, flatfiles etc.). These use cases are important because here the starting point is not necessarily a Rupture (they may not have one for a given event) but usually a table describing the attributes of the earthquake. This has meant that the use of the three individual context objects grouping together the site, distance and rupture attributes have been more convenient, practical and faster than the use of the ContextMaker. This is particularly true as recent changes to the ContextMaker now require information from the job.ini (i.e. the IMTLs) to be passed in order to build the contexts. This can be hacked using dummy variables, but this is not clear to the users nor is it documented anywhere.

To facilitate compatibility between code written using the old style and the GMPEs implemented in the new style I have adopted the following hack, which I'm sharing below for reference (for users who have emailed me directly on this rather than the mailing list). But there is no guarantee this will work in the future, and (again) it is hacky:

import numpy as np
from openquake.hazardlib.gsim.base import SitesContext, DistancesContext, RuptureContext

class MultiContext():
    """
    Object to facilitate compatibility between old and new style GMPEs

    Usage:
    # Set the context objects as normal
    rctx = RuptureContext()
    rctx. ... (add the rupture attributes as normal)

    dctx = DistancesContext()
    dctx. ... (add the distance attributes as normal)

    sctx = SitesContext()
    sctx. ... (add the site attributes as normal

    ctx = MultiContext(sctx, rctx, dctx)

    [mean, stddevs] = gmpe.get_mean_and_stddevs(*list(ctx), imt, stddev_types)
    """

    _slots_ = tuple(SitesContext()._slots_ +
                    list(DistancesContext()._slots_) +
                    list(RuptureContext()._slots_) +
                    # Add missing attributes not found in the sites slots
                    ["fpeak", "backarc", "xvf", "sids"])

    def __init__(self, sctx, rctx, dctx):  
        for slot in self._slots_:
            for obj in [sctx, rctx, dctx]:
                if hasattr(obj, slot):
                    setattr(self, slot, getattr(obj, slot))
        # For the rupture context objects need to turn a set
        # of scalar values into arrays of equal length to those
        # in the site and and distance context objects
        nval = len(self)
        for slot in rctx._slots_:
            if hasattr(rctx, slot):
                setattr(self, slot, getattr(rctx, slot) + np.zeros(nval))
        if "sids" not in dir(sctx):
            # Is SIDS are not found in the sites context object then add
            # these in
            self.sids = np.arange(0, nval)

    def __len__(self):
        # Returns the number of elements from the vector attributes
        # raising an error if any are mismatched
        n = 0
        for slot in self._slots_:
            if hasattr(self, slot) and isinstance(getattr(self, slot), np.ndarray):
                nattr = len(getattr(self, slot))
                if n and nattr != n:
                    raise ValueError("Mismatched number of elements")
                n = nattr
        return n

    def __iter__(self):
        # To input into existing models, yield the object three times
        # e.g. list(ctx) = [ctx, ctx, ctx]
        for i in range(3):
            yield self

As many users including myself will need to migrate their code I recommend providing clear example documentation to illustrate the preferred way to call the GMPEs in future. Please also try to address the use case that the user may not have a rupture in future events and may wish to define the contexts directly from attributes (as has been done before).

mmpagani commented 3 years ago

Graeme, thanks for this. We will provide a more detailed answer to this in due course. For the time being, I would encourage users to avoid sending personal emails and instead write to the OQ Engine mailing list at https://groups.google.com/g/openquake-users?pli=1 or to open an issue here https://github.com/gem/oq-engine/issues

micheles commented 3 years ago

Sorry Grame, you should be more specific, I do not understand exactly what is broken. After the feedback from @cbworden the compatibility issues with the old contexts have been solved, at least to my knowledge. The gist of the fix were the following lines of code in hazardlib.gsim.base, in the function get_mean_and_stddevs:

        if sites is not rup or dists is not rup:
            # convert three old-style contexts to a single new-style context
            ctx = full_context(sites, rup, dists)
        else:
            ctx = rup  # rup is already a good object

The function full_contexts, converting three old-style contexts into a single new-style context, essentially corresponds to your hack. After this fix, implemented one month ago or more, get_mean_and_stddevs is supposed to work with old-style contexts. While ideally deprecated, the method get_mean_and_stddevs will never be removed and will never raise deprecation warnings because I know that backward compatibility is important.

The recommended way to work with GMPEs via the ContextMaker is already documented in the advanced manual:

https://github.com/gem/oq-engine/blob/master/doc/adv-manual/developing.rst#working-with-gmpes-directly-the-contextmaker

The ContextMaker requires passing an imtls dictionary, but passing an empty dictionary is supported and works in the case when you do not need to compute hazard curves and you do not have the imtls handy. I preferred to pass an empty dictionary explicitly rather than using an empty dictionary as default, since it looks clearer to me ("explicit is better than implicit") but that can be negotiated.

So, let me know what is still broken (I fixed all the SMTK tests that were red weeks ago) and I will fix it.

micheles commented 3 years ago

After 11 days I see no further feedback, so I am closing this. Feel free to reopen if needed.

cbworden commented 3 years ago

I'll just say that the change came as a shock after so many years of stability in the interface and then things were suddenly broken. I was fairly lost at first, but did eventually figure out how to make things work. I think we all then ironed out backwards compatibility, but I don't use it -- I use my own function to combine the contexts into a single context before calling get_mean_and_stddevs(). For ShakeMap, we're now tying our builds to specific versions/commits of OQ engine, and will try to keep a closer eye out for any interface changes in the future. We were fairly casual in the past by building off of the master OQ repo so that we could just let users take in any new GMMs that might be defined, but we will be more cautious in the future.

micheles commented 3 years ago

Relying directly on hazardlib master was always dangerous. It is true that breaking changes are rare (the current record is one in 10 years) but there are relatively often insidious changes, such as a bug-fix changing the results. For instance, recently we fixed the AbrahamsonEtAl2014 GMPE; if you rely on master, it means that from one day to the other, your users will see different numbers for any computation involving the AbrahamsonEtAl2014, without knowing what caused the difference. Instead, if you use an official release of the engine, the change in the numbers will not happen randomly the day we merge the bug fix, but only when you decide to make a new release of the ShakeMaps based on a new version of the hazardlib. Moreover in case of doubts you can look at the "What's New" document and see what changed.

hhalaby commented 1 year ago

For info, the link for the recommended way to work with GMPEs via the ContextMaker is now this one: https://github.com/gem/oq-engine/blob/master/doc/adv-manual/general.rst#working-with-gmpes-directly-the-contextmaker