Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

`grid_tools.py` needs some simple API methods #64

Closed iancze closed 5 years ago

iancze commented 8 years ago

Currently, a lot of the grid_tools codebase has been written in support of the end-goal of spectroscopic inference. However, there are also a lot of useful routines in here that could be extracted out of some of the larger classes (e.g., HDF5Creator) and provided as stand-alone methods, either as module-level methods or methods of RawGridInterface. Specifically, I am thinking of the ability to

These are all very simple tasks, but I frequently find myself in the position where I will need to do this for one or two spectra for a plot or just quick playing around, and I don't want to have to go through the hassle of creating a whole grid of spectra when I just need one.

mileslucas commented 5 years ago

I'm going to create a branch and actually work on a few of these things. Here is my current idea-

current

Starfish/
  - grid_tools.py  # Everything

proposed

Starfish/
  - grid_tools/
    - instruments.py    # Instrument classes
    - interfaces.py     # Grid interfaces
    - utils.py          # Wavelength and conversion utilities
    - interpolators.py  # Interpolator(s)

I think this helps reorganize one of the larger files in the project into logical groups. A big question I have is where to put utility methods that might be useful for both grid_tools and spectrum. I have a feeling these should live in a base Starfish/utils.py file.

iancze commented 5 years ago

This sounds like a good organization to me. A base utils.py file also sounds like a good idea.

Maybe @jason-neal can weigh in on whether this makes sense for the application that they are using grid_tools.py for.

jason-neal commented 5 years ago

Yes, the organization seems fine to me. At the moment I am only using the interfaces.

It would be handy if the interfaces could still be imported straight from grid_tools.
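
One way to preserve that is to re-export everything from the new submodules in the package's __init__.py (a sketch, assuming the submodule names proposed above):

# Hypothetical Starfish/grid_tools/__init__.py
# Re-exporting keeps imports like `from Starfish.grid_tools import HDF5Creator`
# working unchanged after the split.
from .instruments import *
from .interfaces import *
from .interpolators import *
from .utils import *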

Regarding the initial comment, I have already implemented that functionality myself (convolutions, Doppler shift, etc.). The convolutions can be found here if inspiration is needed.

mileslucas commented 5 years ago

@iancze I'm considering a couple of different ways of adding that extra functionality; here is what I like right now.

Stateful and lazy

This approach would add methods to all the interface classes as well as the data spectrum. They would mostly look like this-

from Starfish.transforms import broaden

class RawGridInterface:
    def __init__(self, ...):
        self.transform = None

    def broaden(self, inst_v=None, vz=None, vsini=None):
        self.transform = lambda f: broaden(f, inst_v, vz, vsini)

    def load_flux(self, ...):
        # after getting the flux
        if self.transform is not None:
            flux = self.transform(flux)
        return flux

This would mean calling broaden or resample would change the state of the interface but would have no effect on the actual underlying fluxes (the ones written to the grid). The actual transforms would only occur when the flux or the wavelength is loaded. The downside is that this is recalculated each time, but we could add a deep flag (defaulting to False) that, if True, would apply the effects to the underlying grid. For example, in our HDF5Creator we essentially apply instrumental broadening and a truncation deeply.
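
A sketch of what that deep flag could look like on the interface above (grid_points and write_flux are hypothetical names; broaden inside the lambda is the imported Starfish.transforms.broaden):

def broaden(self, inst_v=None, vz=None, vsini=None, deep=False):
    transform = lambda f: broaden(f, inst_v, vz, vsini)
    if deep:
        # apply the transform to the stored fluxes once (hypothetical API)
        for params in self.grid_points:
            self.write_flux(params, transform(self.load_flux(params)))
        self.transform = None
    else:
        # lazy: applied on every load_flux call
        self.transform = transform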

This would also retain the functionality of the utilities provided in Starfish.utils as plain helper functions for doing the calculations.

iancze commented 5 years ago

I think this is a nice solution. One use case I'm thinking of is a first-order "by eye" fit of a star, where either someone's playing around with the parameters to get a ballpark guess or maybe they are plotting up what the "best-fit" parameters from the literature look like.

In this case they don't really need to interpolate the grid, since the coarse spacing is probably fine as regards fundamental stellar parameters, but they do want to tweak things like v_z and vsini (what we called "extrinsic stellar parameters" in Czekala et al.).

The one thing that might be tricky to keep track of is propagating which settings were used when using HDF5Creator with deep=True. I suppose these can be written as metadata to the HDF5 file.
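
For example, with h5py the settings could be written as attributes on the flux dataset (a sketch; the file layout and attribute names here are made up):

import h5py

with h5py.File("grid.hdf5", "a") as fid:
    dset = fid["flux"]  # assuming the fluxes live in a "flux" dataset
    dset.attrs["inst_broaden_fwhm_kms"] = 6.8  # settings used with deep=True
    dset.attrs["wl_truncated"] = True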

Within the MCMC loop itself (in the infamous parallel.py) the broadening operations are merged into a single kernel to save another round trip from the Fourier domain. That may or may not make sense here as well, depending on how you envision the utils stacking together.

mileslucas commented 5 years ago

For the last point- I think for the convolution we should define it as I have, with arguments for each type of kernel, and then create the kernel from those.

In fact, a great solution would be to create a class like this-

class Broaden(Transform):
    def __init__(self, *relevant_args):
        self.relevant_args = relevant_args
        self.kernel = calculate_kernel(*relevant_args)  # kernel built once

    def __call__(self, wave, flux):
        return self._broaden(wave, flux, self.kernel)

    def broaden(self, wave, flux, *relevant_args):
        # calculate a one-time kernel given relevant_args
        kernel = calculate_kernel(*relevant_args)
        return self._broaden(wave, flux, kernel)

    def _broaden(self, wave, flux, kernel):
        # actually do the math
        return convolved_flux

This would allow one-time calls via the Broaden.broaden method but would also allow creating a callable with a consistent, cached kernel for playing around.
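
Usage of that pattern might look like this (argument values illustrative):

b = Broaden(10.0)                      # e.g. vsini; kernel computed once
flux_a = b(wave, flux)                 # repeated calls reuse the cached kernel
flux_b = b(wave, other_flux)           # same kernel, new spectrum
flux_c = b.broaden(wave, flux, 25.0)   # one-off call with a fresh kernel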

I'm also curious reading through the code- why is there a preference for multiplying kernels in the Fourier domain instead of convolving in the wavelength domain? I understand it is mathematically the same, but I would generally use a convolution from astropy or scipy rather than a home-brew solution.
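
(For reference, the wavelength-domain version on a uniform log-lambda grid is essentially one scipy call; a sketch, with dv_kms being the grid's velocity spacing:)

from scipy.ndimage import gaussian_filter1d

def broaden_direct(flux, sigma_kms, dv_kms):
    # convert the kernel width from km/s to pixels on the uniform grid
    return gaussian_filter1d(flux, sigma=sigma_kms / dv_kms)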

iancze commented 5 years ago

The kernels are multiplied in the Fourier domain because I am a radio astronomer by training 😃

I'm only half joking there, but I think the real reason is that I was playing around with doing it in the wavelength domain and it was both slow (compared to the FFT) and a bit harder to check everything from a signals and Nyquist sampling standpoint. My understanding was that there is something of a precedent to do broadening operations in the Fourier domain too (e.g., see TODCOR or Tonry 79: https://ui.adsabs.harvard.edu//#abs/1979AJ.....84.1511T/abstract).
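
(For concreteness, the Fourier-domain Gaussian broadening being described amounts to something like this minimal sketch, not Starfish's actual code, assuming a uniform log-lambda grid with velocity spacing dv_kms:)

import numpy as np

def broaden_fft(flux, sigma_kms, dv_kms):
    ss = np.fft.rfftfreq(len(flux), d=dv_kms)            # cycles per (km/s)
    FF = np.fft.rfft(flux)
    taper = np.exp(-2 * (np.pi * sigma_kms * ss) ** 2)   # analytic FT of a Gaussian
    return np.fft.irfft(FF * taper, n=len(flux))         # same length on the way back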

mileslucas commented 5 years ago

Oh fun, I had an REU at NRAO in Socorro last summer; I learned a painful amount about aperture synthesis imaging, haha. In that case, I'll keep your original methods, which I assume work great. Much further down the line it could be worth doing timing tests, but for now the existing functionality has precedent.

mileslucas commented 5 years ago

@iancze I am running into some trouble with the math. Please take a look at Starfish/transforms.py on the branch refactor/spectrum

I have created two superclasses of transforms- Transform and FTransform. The first is an abstract class that assumes all calls to it will be made with a normal wavelength and flux and will return a normal wavelength and flux. The second is designed to work on wavelengths in the Fourier domain, but otherwise has the same call signature as Transform. Does this seem like a good approach?

I have a class called Broaden that is an FTransform, and I'm just not sure if I have the math right. Similarly, I'm not really sure how resampling should work. I have based these classes on the code from both HDF5Creator.process_flux() and Order.updateTheta() from parallel.py.

If you can think of any good unit tests, feel free to add them to tests/test_transforms.py :)

Edit: The style I've adopted here is extremely similar to the PyTorch nn.Module, which is great for when we end up moving more of the code over. The basic principle is that I can string many modules together in sequential order, effectively making a computation graph, and give my wave and flux at the start and get my fully transformed wave and flux at the end.

For example

wave, flux -> loglambda -> broaden -> resample -> invloglambda -> extinct -> chebyshev

This is the kind of sequence of operations performed during a Theta optimization. My current implementation can do this pretty easily just using helper functions, but in PyTorch there is a lot of power in torch.nn.Sequential, which links together many modules in a sequential list, exactly like I've shown above, allowing us to write the entire update-theta code in a few lines. In addition, we can still use a PyTorch implementation extremely easily with numpy data, so we could use it equally in a sampler and in an exploratory widget.
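
A minimal sketch of that chaining, assuming each transform maps (wave, flux) to (wave, flux) as above (the class names in the example pipeline are illustrative):

class Sequential:
    def __init__(self, *transforms):
        self.transforms = transforms

    def __call__(self, wave, flux):
        # thread (wave, flux) through each transform in order
        for t in self.transforms:
            wave, flux = t(wave, flux)
        return wave, flux

# pipeline = Sequential(LogLambda(), Broaden(vsini), Resample(new_wave))
# wave_out, flux_out = pipeline(wave, flux)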

iancze commented 5 years ago

Ok I had a chance to take a look.

What is fwave (e.g., line 122)? If this is the Fourier coordinate then I'm not sure that it's appropriate to be Doppler shifting the coordinate this way. If you did want to do this in the Fourier domain then what you'd want to do is apply a phase shift to the complex values themselves, but I think it's just as easy to do it in the wavelength domain.

The math was set up so that the only things going on in the Fourier domain were the broadening operations for the line spread function (represented by a Gaussian) and the vsini broadening kernel (parameterized from Gray's textbook). Rather than a convolution in wavelength, these are carried out as multiplications of the Fourier transforms of the kernels with the FT of the spectrum in the Fourier domain (line 142). This means that the round trip from real to Fourier to real space should result in arrays with the same length (no resampling done yet).

The resampling and Doppler shifting come afterwards, in the wavelength domain. The resampling just takes advantage of the Nyquist theorem: if we have convolved our spectrum with a broad kernel, then we don't need to sample it so often (i.e., with densely spaced wavelengths) to retain all of the necessary information. This makes sense in the HDF5Creator step, where the raw spectra are at R=500,000 and we only need to save them at R=40,000, since it saves a lot of space and computational time.
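
As a back-of-the-envelope version of that downsampling (the 4x oversampling factor is my illustration, not Starfish's choice):

import numpy as np

c_kms = 2.99792458e5
R = 40_000                 # target resolving power after broadening
dv_new = c_kms / (R * 4)   # km/s; keep ~4 samples per resolution element

def resample(wl_new, wl, flux):
    # simple linear interpolation onto the coarser grid
    return np.interp(wl_new, wl, flux)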

The PyTorch adaptation sounds cool.

mileslucas commented 5 years ago

Okay, I think fwave is supposed to be the wavelength returned from calculate_wl_dict- which is just a log-lambda spaced normal wavelength? I'm not certain.

If that's the case, we should realistically be calculating the log-lambda grid at the beginning and never reverting, assuming it has the same shape as the original wl grid.

iancze commented 5 years ago

Ah this is a somewhat embarrassing part of the code. Let me try to explain the various things:

A wl_dict is a dictionary of log-lambda spaced wavelengths plus metadata about that spacing (as one might find in FITS header keywords). It is generated by calculate_wl_dict (link).

calculate_dv is a helper method to determine the maximal sampling rate (or minimum velocity spacing, dv) of a given spectrum, say, a raw spectrum from the PHOENIX grid, which may or may not be log-lambda spaced. That way we can use calculate_wl_dict to make a new wavelength grid, which is log-lambda spaced and satisfies the Nyquist sampling theorem, onto which we can interpolate the raw PHOENIX spectrum and begin doing the FFT operations.
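
In other words, something like the following paraphrase (not the actual implementations):

import numpy as np

c_kms = 2.99792458e5

def calculate_dv(wl):
    """Minimum velocity spacing (km/s) between adjacent pixels of wl."""
    return c_kms * np.min(np.diff(wl) / wl[:-1])

def log_lambda_grid(wl_start, wl_end, dv_kms):
    """Log-lambda grid covering [wl_start, wl_end] with uniform velocity step."""
    step = 1 + dv_kms / c_kms
    n = int(np.ceil(np.log(wl_end / wl_start) / np.log(step)))
    return wl_start * step ** np.arange(n + 1)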

mileslucas commented 5 years ago

So, for the purpose of these simple transforms we should assume log-lambda spacing. So if we want to write transforms specifically in the Fourier domain we want to have the functions take in the ss (in the code currently) or the Fourier wave components and return (ss, flux). We would create ss by initially calling np.fft.rfftfreq(wl) or similar and finally do...something? to get the Fourier back to the normal domain?

iancze commented 5 years ago

I suppose it depends on the use case. If this is a general use case, then we might not want to assume log-lambda spacing. I think it will be hard to test for, unless we require a specific kind of object or something. I envision most users would see this as an easy method that takes a spectrum they have and broadens it, without needing to worry about how their grid is sampled.

Within the actual inference loop, we will definitely want to assume log-lambda spacing and pre-cache whatever we can.

mileslucas commented 5 years ago

If we set it up like this-

class Transform:
    def __call__(self, wave, flux):
        return self._transform(wave, flux)

    def _transform(self, wave, flux):
        '''assumes log-lambda spacing'''
        # do the math
        return wave, transformed_flux

def my_transform(wave, flux):
    '''does not assume log-lambda spacing'''
    ll_wave = gimme_log_lambda(wave)  # placeholder for a log-lambda resampler
    temp_t = Transform()
    return temp_t(ll_wave, flux)

we can have optimized code for our high-intensity work by using the Module approach but have flexibility for times when the code is more utility than Module.

mileslucas commented 5 years ago

Okay, I've written a few transforms that I think follow the math correctly. @iancze double check, please.

I have put these into Starfish/transforms.py. Here is an example of the usage within the HDF5Creator-

def __init__(self, ...):
    ...
    self.resample_loglam = Resample(self.wl_loglam)
    self.inst_broaden = InstrumentalBroaden(self.instrument)
    self.resample_final = Resample(self.wl_final)

def process_flux(self, parameters, header=False):
    flux, header = self.GridInterface.load_flux(parameters, header=True)
    _, flux_loglam = self.resample_loglam(self.wl_native, flux)
    _, flux_tapered = self.inst_broaden(self.wl_loglam, flux_loglam)
    _, flux_final = self.resample_final(self.wl_loglam, flux_tapered)

    return (flux_final, header)

In the above example you can see how, since we're only doing one broadening action, we use the normal __call__ method to avoid having to write out the Fourier transforms.

We could even go further and implement it very simply as

self.transform = lambda flux: self.resample_final(*self.inst_broaden(*self.resample_loglam(self.wl_native, flux)))

For fun, I timed the previous implementation against my one-liner above: while processing the grid, the HDF5Creator previously took about 2.89 s/it and now takes 1.78 s/it, so a very slight change in code for a decent speedup.

Edit:

I've added some docs- Ian you should be able to see them here

Edit2: I've created some unit tests for all the classes. Most of them just make sure that everything can run, but I don't have any clear tests of functionality- especially for the broadening transforms. I was able to sort of test Doppler shifting by checking that all the redshifted wavelengths were greater than the stationary ones. Are there any ways of unit-testing these transformations that you can think of?
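
(One property-style option, sketched below; instrumental_broaden is a stand-in for whichever broadening transform is under test: a normalized kernel should conserve integrated flux, and a flat spectrum should stay flat.)

import numpy as np

def test_broaden_conserves_flux():
    wave = 5000 * np.exp(np.arange(10_000) * 1e-5)   # log-lambda grid
    flux = np.ones_like(wave)
    flux[5_000] += 10.0                              # a narrow "line"
    _, broadened = instrumental_broaden(wave, flux, fwhm=10.0)  # hypothetical
    assert np.isclose(np.trapz(broadened, wave), np.trapz(flux, wave), rtol=1e-3)

def test_broaden_flat_stays_flat():
    wave = 5000 * np.exp(np.arange(10_000) * 1e-5)
    flux = np.ones_like(wave)
    _, broadened = instrumental_broaden(wave, flux, fwhm=10.0)  # hypothetical
    assert np.allclose(broadened, 1.0, atol=1e-6)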

Edit3: added a CalibrationCorrect transform. Can you check that it makes sense, too? It has docs and basic tests with it.

mileslucas commented 5 years ago

So this development has become inextricably coupled with the remodel of the likelihood model. Primarily, the backend will determine the structure of the operations. Right now I am working on a PyTorch implementation in lieu of a Theano implementation (which would be the requirement for PyMC3), given Theano's limited lifetime.

Given that, I will close this issue and will incorporate my transformation changes into the modeling refactor.