nanograv / enterprise

ENTERPRISE (Enhanced Numerical Toolbox Enabling a Robust PulsaR Inference SuitE) is a pulsar timing analysis code, aimed at noise analysis, gravitational-wave searches, and timing model analysis.
https://enterprise.readthedocs.io
MIT License
65 stars 67 forks source link

How to do generic Fourier signals? #93

Closed jellis18 closed 6 years ago

jellis18 commented 7 years ago

We already have the ability go pretty much use whatever spectrum we want through the FourierBasisGP class factory and we can do lots of different selections. However, right now the basis is fixed (i.e. for something like DM we would need a completely new signal factory or at least a lot of copied code).

I've been working on a few things to make this more flexible and I'd like to know what you think. I've defined a new factory (the name could change):

def Fourier(func, ftype, **func_kwargs):

    class Fourier(object):

        def __init__(self, psr):
            self._psr = psr

        def __call__(self, **kwargs):
            kwargs.update(func_kwargs)
            return selection_func(func)(self._psr, **kwargs)

        @property
        def ftype(self):
            return ftype

    return Fourier

This uses some of the fancier stuff that we are using from the selection, that is, if the underlying funcs arguments are attributes of the pulsar class then it will use them. This way we can still use generic functions without having to specifically write them to read in the Pulsar class (just make the arguments attributes of the Pulsar class).

So this could interact with FourierBasisGP as follows

pl = base.Function(utils.powerlaw, log10_A=parameter.Uniform(-18,-12), gamma=parameter.Uniform(1,7))
basis_dm = Fourier(utils.createfourierdesignmatrix_dm, 'dm')
basis_red = Fourier(utils.createfourierdesignmatrix_red, 'red')
rn = FourierBasisGP(pl, basis_red, selection=selection)
dm = FourierBasisGP(pl, basis_dm, selection=selection)

for DM and red power-law noise. Basically it would now require a spectrum and a basis. The type part of it is needed so that we could have different kinds. The cool thing is that we can also set arguments on the outside so if we wanted like Fourier(utils.createfourierdesignmatrix_red, 'red', fmin=1e-9, fmax=1e-7).

What do you all think?

jellis18 commented 7 years ago

So @vallis wrote on Slack the other day

I think the best approach is to decide what API we want, then figure out how to implement it. Thinking very generally, maybe we want an abstraction like BasisGP(priorFunction, basisFunction, selectionFunction)

Here basisFunction should return a vector and a matrix (the numerical "labels" of the basis vectors, and a matrix of the basis vectors). It's a good call to have basisFunction behave like Selection, in that it can access pulsar fields directly. But it should also work like a Function (e.g., carry parameters that can be overridden at sampling time, and whose names are specialized).

Can we write an object that does the job of both Function and Selection?

(The rest of our code will need changing because right now it assumes that Fmat never changes it. If we make it parameter dependent, we do need caching of some sort, but only on the parameters that change it.)

I think I have made the tools to do this:

First I have changed selection_func so that it can read in both standard arguments as well as keyword arguments. We may not need this but it is good to be more general.

I've also re-written the Function factory to be more selection-like. With these methods it should be possible to make the BasisGP class factory. I'll get on this soon but I wanted to give you all a heads up first to let me know what you think.

You can find an example notebook here, which should work with the current master branch as all the new functionality is defined in the notebook.

vallis commented 7 years ago

Before I get to Function, let me suggest a small tweak to selection_func. See the comment inline.

def selection_func(func):
    funcargs = inspect.getargspec(func).args

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        targs = []
        pulsarargs = {}
        for ct, arg in enumerate(args):
            # this could lead to confusing errors if a positional argument
            # is given after a pulsar argument. An alternative could be to take
            # a "psr=..." keyword argument (instead of taking the pulsar as a
            # positional argument), and used that to fill named parameters
            # that are not obtained from *args or **kwargs

            if isinstance(arg, Pulsar):
                pulsarargs.update({funcarg: call_me_maybe(getattr(arg, funcarg))
                                   for funcarg in funcargs[ct:] if hasattr(arg, funcarg)})
            else:
                targs.append(arg)
        pulsarargs.update(**kwargs)
        return func(*targs, **pulsarargs)

    return wrapper
jellis18 commented 7 years ago

@vallis, that makes more sense. I'll make those changes.

vallis commented 7 years ago

Yup, it's still magic but perhaps less confusing

vallis commented 7 years ago

OK, here's how I would change Function:

# The idea to `Function` is to associate sampling parameters with user-defined functions.

def Function(func, name='', **func_kwargs):
    fname = name

    class Function(object):
        def __init__(self, name, psr=None):
            self._func = selection_func(func)
            self._psr = psr

            self._params = {}
            self._defaults = {}

            # divide keyword parameters into those that are Parameter classes,
            # Parameter instances (useful for global parameters),
            # and something else (which we will assume is a value)
            for kw, arg in func_kwargs.items():
                if isinstance(arg, type) and issubclass(arg, (parameter.Parameter, parameter.ConstantParameter)):
                    # parameter name template
                    # pname_[signalname_][fname_]parname
                    par = arg(name + '_' + ((fname + '_') if fname else '') + kw)
                    self._params[kw] = par
                elif isinstance(arg, (parameter.Parameter, parameter.ConstantParameter)):
                    self._params[kw] = arg
                else:
                    # we could also extract the value from parameter.ConstantParameter and store it here...
                    self._defaults[kw] = arg

        def __call__(self, *args, params={}, **kwargs):
            # order of parameter resolution:
            # - parameter given in kwargs
            # - named sampling parameter in self._params, if given in params dict
            #   or if it has a value
            # - parameter given as constant in Function definition
            # - default value for keyword parameter in func definition

            for kw, arg in func_kwargs.items():
                if kw not in kwargs and kw in self._params:
                    par = self._params[kw]

                    if par.name in params:
                        kwargs[kw] = params[par.name]
                    elif hasattr(par, 'value'):
                        kwargs[kw] = par.value

            for kw, arg in self._defaults:
                if kw not in kwargs:
                    kwargs[kw] = arg

            if self._psr is not None and 'psr' not in kwargs:
                kwargs['psr'] = self._psr

            return self._func(*args, **kwargs)

        @property
        def params(self):
            # if we extract the ConstantParameter value above, we would not
            # need a special case here
            return [par for par in self._params.values() if not
                    isinstance(par, parameter.ConstantParameter)]

    return Function

Then the API becomes

pl = Function(utils.powerlaw, log10_A=parameter.Uniform(-18,-12), gamma=parameter.Uniform(0, 7))

log10_Amp = parameter.Uniform(-10, -5)
log10_Q   = parameter.Uniform(np.log10(30), np.log10(3000))
t0        = parameter.Uniform(psr.toas.min(), psr.toas.max())
fourier_env = Function(createfourierdesignmatrix_env, t0=t0, log10_Amp=log10_Amp, log10_Q=log10_Q)

basis = fourier_env(psr.name, psr=psr)
spectrum = pl(psr.name, psr=psr)

params = {'B1855+09_log10_A': -14,
          'B1855+09_gamma': 4.33,
          'B1855+09_log10_Amp': -8,
          'B1855+09_log10_Q': np.log10(300),
          'B1855+09_t0': 4783728350.5632496}

Tspan = psr.toas.max() - psr.toas.min()
F, f = basis(params=params, Tspan=Tspan)
psd = spectrum(f, params=params)

...and note this uses the modified selection_func, something like:

def selection_func(func):
    funcargs = inspect.getargspec(func).args

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        targs = list(args)

        if len(targs) < len(funcargs) and 'psr' in kwargs:
            psr = kwargs['psr']

            for funcarg in funcargs[len(args):]:
                if funcarg not in kwargs and hasattr(psr, funcarg):
                    targs.append(call_me_maybe(getattr(psr, funcarg)))

        if 'psr' in kwargs and 'psr' not in funcargs:
            del kwargs['psr']

        return func(*targs, **kwargs)

    return wrapper
jellis18 commented 7 years ago

Did you actually run this? I get the error

  File "<ipython-input-157-0a9e4022e932>", line 74
    def __call__(self, *args, params={}, **kwargs):
                                   ^
SyntaxError: invalid syntax
vallis commented 7 years ago

Ah, I used a Python 3 feature there (keyword-only arguments). This should work

def __call__(self, *args, **kwargs):
    params = kwargs.get('params',{})
    if 'params' in kwargs:
        del kwargs['params']

In either case one needs to call the function by giving params= explicitly, just as we would give psr= explicitly.

jellis18 commented 7 years ago

Ok I figured it was some Python 3 nonsense :)

jellis18 commented 7 years ago

Ok I've incorporated your changes and made a first stab at a BasisGP factory in my test notebook. It still needs caching but it at least seems to work for both Fourier and ECORR basis types. Let me know what you think.

jellis18 commented 7 years ago

Ok, I think I've gotten the caching working and we could probably use this same decorator for caching other things as well, like white noise get_ndiag methods when we fix the white noise. Checkout the notebook.

Here is the caching function

def cache_call(attr, limit=10):  

    def cache_decorator(func):
        cache = {}
        cache_list = []

        def wrapper(self, params):
            keys = getattr(self, attr)
            key = tuple([(key, params[key]) for key in keys if key in params])
            if key not in cache:
                cache_list.append(key)
                cache[key] = func(self, params)
                if len(cache_list) > limit:
                    del cache[cache_list.pop(0)]
                return cache[key]
        return wrapper

    return cache_decorator

It takes a string argument to pull out a subset of parameters and it also has a limit so the cache never gets too big.

jellis18 commented 7 years ago

OK, this cache doesn't work with multiple signals because it defined the cache dictionary per class not per instance. I had to change it like this:

def cache_call(attr, limit=10):
    """Cache function that allows for subsets of parameters to be keyed."""

    def cache_decorator(func):

        def wrapper(self, params):
            keys = getattr(self, attr)
            key = tuple([(key, params[key]) for key in keys if key in params])
            if key not in self._cache:
                self._cache_list.append(key)
                self._cache[key] = func(self, params)
                if len(self._cache_list) > limit:
                    del self._cache[self._cache_list.pop(0)]
                return self._cache[key]
        return wrapper

    return cache_decorator

Now the cache is part of the instance. At the moment it is initialized when __init__ is called but we could initialize it in the cache if it isn't found to begin with. Does anyone know a better way of doing this?

jellis18 commented 7 years ago

This was taken care of in PR #95; however, now I'm thinking that it isn't general enough. The BasisGP class assumes that the prior on the GP coefficients is diagonal. It may be better if this class would allow for a dense matrix but check to see if it is 1-d or 2-d before inverting so we can save speed.

jellis18 commented 6 years ago

This is now fully implemented in PR #132.