dtopotools: Fault and SubFault classes and Mw functions

@mandli: I'm working on figuring out the new structure of dtopotools.py. It appears that the SubFault class is for representing a single subfault while the Fault class is for a fault that might be composed of many subfaults? But then why is SubFault a subclass of Fault? This seems to imply that a SubFault might itself have a list self.subfaults, which doesn't seem to make sense. I don't see where this subclassing is really needed, so should it just be

    class SubFault(object):

I'm trying to clean up the definitions of Mw. I suggest the following to minimize the number of places where formulas or unit conversions appear. The formula for Mw is defined once and then used in both classes. Or maybe we don't need to define Mw for each class and can always just use this function directly when needed? Maybe you have other ideas about how the units of Mo should be specified too.

def Mw(Mo, units="N-m"):
    """ 
    Calculate moment magnitude based on seismic moment Mo.
    Follows USGS recommended definition from 
        http://earthquake.usgs.gov/aboutus/docs/020204mag_policy.php
    """

    if units == "N-m":
        Mw = 2/3.0 * (numpy.log10(self.Mo()) - 9.05)
    elif units == "dyne-cm":
        Mw = 2/3.0 * numpy.log10(self.Mo()) - 10.7
        #  = 2/3.0 * (numpy.log10(1e-7 * self.Mo()) - 9.05)
    else:
        raise ValueError("Unknown unit for Mo: %s." % units)

    return Mw

class Fault...

    def Mo(self):
        r""" 
        Calculate the seismic moment for a fault composed of subfaults,
        in units N-m.
        """

        total_Mo = 0.0
        for subfault in self.subfaults:
            total_Mo += subfault.Mo()
        return Mo

    def Mw(self):
        r""" Calculate the moment magnitude for a fault composed of subfaults."""
        return Mw(self.Mo())

class SubFault...

    def Mo(self):
        r""" Calculate the seismic moment for a single subfault, in units N-m. """

        # Convert units of rigidity mu to Pascals 
        # (1 Pa = 1 Newton / m^2 = 10 dyne / cm^2)
        if self.units["mu"] == "Pa":
            mu = self.mu
        if self.units["mu"] == "GPa":
            # e.g. mu = 40 GPa
            mu = 1e9 * self.mu
        if self.units["mu"] == "dyne/cm^2":
            # e.g. mu = 4e11 dyne/cm^2
            mu = 0.1 * self.mu
        elif self.units["mu"] == 'dyne/m^2':
            # Does anyone use this unit?  
            mu = 1e-5 * self.mu 
        else:
            raise ValueError("Unknown unit for rigidity %s." % self.units['mu'])

        dimensions, slip = self.convert2meters(["dimensions", "slip"])
        total_slip = dimensions[0] * dimensions[1] * slip
        Mo = mu * total_slip
        return Mo

    def Mw(self):
        r""" Calculate the moment magnitude for a single subfault."""
        return Mw(self.Mo())

Thanks for taking a look at this, some responses are below.

Class Structure

I think I originally was thinking that the SubFault class would inherit a lot more functionality from the Fault class but as I was implementing the functionality I realized there were spots where they had to differ enough that a complete replacement for some of the functions was needed. Since they still do share some functionality I would prefer to keep them like this but perhaps another parent for both would be more appropriate, maybe like BaseFault which both Fault and SubFault along with all of the other Fault sub-classes could inherit from. This would clear up the intention of the inheritance a lot probably. We could also add a leading underscore _BaseFault to indicate that one should not use the parent class.

`Mw` Definitions

I am always up for reduction of code replication and I like the way you did that in the above. If we retain the class inheritance the Mw function could be put in the base class and Mo can be defined in each of the sub-classes.

Having a BaseFault class might be a good idea.

I'm testing the new dtopotools routines to read subfaults and write a dtopo file. I had a hard time getting it to write a file with the set of x-y values I want (to compare with old results), but got it to work with something like this (using ddde94086e):

dtopo = dtopotools.UCSBFault()
dtopo.read('tohoku-ucsb.txt')
mx = int((xupper - xlower)*60 + 1)
my = int((yupper - ylower)*60 + 1)
dtopo.x = linspace(xlower,xupper,mx)
dtopo.y = linspace(ylower,yupper,my)
dtopo.delta = 1./60.    # not set since create_coordinate_arrays not called
dtopo.create_deformation_array()
dtopo.t = array([0., 1.])       # default is [0, 5, 10]:  why?
dtopo.write('tohoku-ucsb.tt3',3)

Some things I noticed...

Why is there

    def create_coordinate_arrays

in both classes Fault and SubFault? This creates the x,y vectors for the dtopo file to be written out, which generally is for a Fault, not an individual subfault.

In this routine, I'm not sure it's a good idea to specify resolution in terms of points per degree desired. What if you want the resolution to be 2 degrees? Maybe resolution=0.5 works although the docstring says it's an int.

Also I think you need to set

        N = [int((rect[1] - rect[0]) * resolution) + 1,
             int((rect[3] - rect[2]) * resolution) + 1]

to get the desired resolution, not

        N = [int((rect[1] - rect[0]) * resolution),
             int((rect[3] - rect[2]) * resolution)]

More generally there should be an easy way to specify arbitrary x,y vectors for the desired dtopo file -- it's not necessarily true that the same size buffer zone on each side of the fault is desired, although this is a reasonable default.

I guess you can set the .x and .y attributes of the Fault explicitly, as I did in the code above, but as currently implemented, you then also have to set the .delta attribute in order to use .write().

I'm not sure I like having the routines to read a dtopo file and plot dtopo files being part of the Fault class. There are situations where the dtopo is specified directly without having any fault information (e.g. from a tsunami inversion) and the dtopo might not even correspond to an earthquake (e.g. from a landslide) so it seems odd to have to instantiate a Fault object to read it in and plot it. Should we have a DTopography class for managing dtopo objects that is more independent of Fault? I'd suggest that a Fault object should have a dtopo attribute that is of this class.

I'm willing to take a pass at reorganizing classes once we've converged on a plan.

Having a BaseFault class might be a good idea.

We should put this on the docket for "things needing to be done" along with some of the other restructures you are suggesting.

I'm testing the new dtopotools routines to read subfaults and write a dtopo file. I had a hard time getting it to write a file with the set of x-y values I want (to compare with old results), but got it to work with something like this (using ddde94086e):
dtopo = dtopotools.UCSBFault()
dtopo.read('tohoku-ucsb.txt')
mx = int((xupper - xlower)*60 + 1)
my = int((yupper - ylower)*60 + 1)
dtopo.x = linspace(xlower,xupper,mx)
dtopo.y = linspace(ylower,yupper,my)
dtopo.delta = 1./60.    # not set since create_coordinate_arrays not called
dtopo.create_deformation_array()
dtopo.t = array([0., 1.])       # default is [0, 5, 10]:  why?
dtopo.write('tohoku-ucsb.tt3',3)

I am not sure why I set dtopo.t = [0, 5, 10] originally. I think this might have again been a hold over from something I was testing and we should modify it to the more instantaneous values of [0, 1].

Some things I noticed...

Why is there
    def create_coordinate_arrays
in both classes Fault and SubFault? This creates the x,y vectors for the dtopo file to be written out, which generally is for a Fault, not an individual subfault.

I wanted to have the ability to have a single SubFault object be enough to actually create a dtopo file as well as plot the deformations. This mainly was because at the time I was playing with single sub-fault characterizations for some inversion testing and lead to a divergence on how both classes calculate this. In actuality I think I wrote the SubFault class create_coordinate_arrays function before the more general Fault class function so it would be worth a look as to how we could eliminate one or the other.

In this routine, I'm not sure it's a good idea to specify resolution in terms of points per degree desired. What if you want the resolution to be 2 degrees? Maybe resolution=0.5 works although the docstring says it's an int.

That is rather counter intuitive, maybe specifying the delta directly is better here?

Also I think you need to set

        N = [int((rect[1] - rect[0]) * resolution) + 1,
             int((rect[3] - rect[2]) * resolution) + 1]

to get the desired resolution, not

        N = [int((rect[1] - rect[0]) * resolution),
             int((rect[3] - rect[2]) * resolution)]

Always screw that up.

More generally there should be an easy way to specify arbitrary x,y vectors for the desired dtopo file -- it's not necessarily true that the same size buffer zone on each side of the fault is desired, although this is a reasonable default.

I guess you can set the .x and .y attributes of the Fault explicitly, as I did in the code above, but as currently implemented, you then also have to set the .delta attribute in order to use .write().

I am surprised that that happened, the intention was that delta should be computed as needed if not already provided. Same goes for the calling of the create_deformation_array routine, the intention is that if the user calls write after specifying the x and y coordinate arrays everything should be automatically created as needed. For instance your example script should be able to be shortened to

dtopo = dtopotools.UCSBFault('tohoku-ucsb.txt')
mx = int((xupper - xlower)*60 + 1)
my = int((yupper - ylower)*60 + 1)
dtopo.x = linspace(xlower,xupper,mx)
dtopo.y = linspace(ylower,yupper,my)
dtopo.t = array([0., 1.])
dtopo.write('tohoku-ucsb.tt3',3)

so basically only the things you care about touching are touched. If this is not the case then this is a bug.

I'm not sure I like having the routines to read a dtopo file and plot dtopo files being part of the Fault class. There are situations where the dtopo is specified directly without having any fault information (e.g. from a tsunami inversion) and the dtopo might not even correspond to an earthquake (e.g. from a landslide) so it seems odd to have to instantiate a Fault object to read it in and plot it. Should we have a DTopography class for managing dtopo objects that is more independent of Fault? I'd suggest that a Fault object should have a dtopo attribute that is of this class.

That's not a bad idea (and was actually the original name of the class). This would separate out the functionality pretty well (reading, writing, and plotting) from the fault specific functionality (dealing with sub-fault specifications, magnitudes, etc.).

I'm willing to take a pass at reorganizing classes once we've converged on a plan.

At this point I think it might be worth coming up with a list of changes/additions that need to be done and maybe talking through them and heading forward?

On further reflection, I'm not sure we need a BaseFault class. It seems to me a fault should always consist of a set of subfaults, perhaps only a single subfault in the set for some examples. But functions like read should be needed only for the Fault class and should read a subfaults file to define the set of SubFault objects associated with this fault.

The create_deformation_array method of a Fault object should loop over the subfaults, applying Okada to each subfault and summing up the results, using the same (x,y) grid for each. This should create a DTopography object with attributes dtopo.x, dtopo.y, dtopo.z_list (note that for time dependent faults, we need a list of deformations). This suggests a Fault object might have a dtopo attribute to store this. But I'm undecided about whether SubFault should also have a dtopo attribute. It makes sense to, since Okada is applied to each subfault separately before summing up, and for a time-dependent fault these need to be computed for each subfault and then at each time some fraction of each must be used in the linear combination. The argument against this is that the resulting z array depends on x,y and we'd need to insure that each subfault's dtopo object has the same x,y. I guess this is easy to do if the create_deformation_array method takes x,y as arguments and then starts by initializing each subfault's dtopo.x and dtopo.y correspondingly before calling Okada.

I don't think Fault should have a read_dtopo method, since dtopo can be generated by a fault but it makes no sense to read other dtopo into the Fault object. Instead DTopography should have a read method for reading in dtopo files.

SubFault and Fault might both have an Mo attribute (or method) since the seismic moment is defined for each subfault and then summed to get Mo for the full fault. But I don't think it makes sense for each subfault to have an Mw -- the moment magnitude of the quake depends on all the subfaults.

clawpack / geoclaw