spacetelescope / mosviz

MOS Visualization Tool
http://mosviz.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
5 stars 17 forks source link

Allow data-agnostic loaders #39

Closed nmearl closed 6 years ago

nmearl commented 8 years ago

MOSViz uses hard coded component ids in the back with the expectation that the data is strictly the nircam data set.

We're hoping that a single data loader that's exposed in the Glue interface can wrap the process of calling the three functions required to load the specific data.

Any ideas @mattjhill, @astrofrog?

mattjhill commented 8 years ago

My initial thought was that people would write their loaders (in .glue/config.py?) and then specify the loaders in the table. You might need to tell users to name the components specific things ("Wavelength", "Spectral Flux", etc) but I don't really see any way around that.

You could just as easily do this with a single function which accepts three filenames and returns three Data objects, so it would only need one more column in the table.

As an example in the the load_selection() function you could do something like this (ignoring checking if the rows/functions actually exist)

def load_selection(self, row):
    ...
    spectrum1d_loader = getattr(config, row["spectrum1d_loader"])
    spec1d_data = spectrum1d_loader(row['spectrum1d'])

Is this along the lines of what you were thinking?

nmearl commented 8 years ago

Is this along the lines of what you were thinking?

Sort of. I don't think the loader should be specified for each row, that seems unnecessarily tedious for the user to have to construct, since there'd now be three new columns that are pretty tangental to the real data the user cares about.

Instead, how about we define the table as having to have header keywords that identify the function names:

# %ECSV 0.9
# ---
# meta:
#   loaders:
#       spec1d: nirspec_spectrum1d_reader
#       spec2d: nirspec_spectrum2d_reader
#       image: acs_cutout_image_reader
# datatype:
# - {name: id, datatype: string}
# - {name: ra, unit: deg, datatype: float64}
# - {name: dec, unit: deg, datatype: float64}
# - {name: spectrum1d, datatype: string}
# - {name: spectrum2d, datatype: string}
# - {name: cutout, datatype: string}
# - {name: slit_width, unit: arcsec, datatype: float64}
# - {name: slit_length, unit: arcsec, datatype: float64}
# - {name: pix_scale, datatype: float64}
id ra dec spectrum1d spectrum2d cutout slit_width slit_length pix_scale
nircam_17261 53.182575 -27.768171 spec1D/Final_spectrum_MOS_1_015_107_CLEAR-PRISM_MOS_PRISM-observation-2-c0e0_000.fits spec2D/2Dspectrum_MOS_1_015_107_CLEAR-PRISM_MOS_PRISM-observation-2-c0e0_000.fits cutouts/17261_jwst_nircam_f365w_cutout.fits 0.2 3.3 0.66
nircam_13956 53.186939 -27.790991 spec1D/Final_spectrum_MOS_1_015_108_CLEAR-PRISM_MOS_PRISM-observation-2-c0e0_000.fits spec2D/2Dspectrum_MOS_1_015_108_CLEAR-PRISM_MOS_PRISM-observation-2-c0e0_000.fits cutouts/13956_jwst_nircam_f365w_cutout.fits 0.2 3.3 0.66
mattjhill commented 8 years ago

That looks good to me. Do you (or @astrofrog) know if the metadata is (or can be) read into the glue Data object for the table?

astrofrog commented 8 years ago

I'm not sure if the ECSV units are read into glue at the moment, but it would be easy to fix if not. The FITS table reader already deals with the units correctly so it's a matter of copying what is done there unit-wise. I can look into it early next week, but can also help if anyone else wants to try it out today.

mattjhill commented 8 years ago

From the glue source it looks like the astropy_table data factories read in the units, but there is also a "meta" attribute in the Table object which I don't think is being passed to the Data object. That's where the loader info would go in the table @nmearl put above.

A quick and dirty fix would be to just add result.meta = table.meta in the astropy_tabular_data function. I don't if that's a "good" way to do it though.

astrofrog commented 8 years ago

There's no reason we can't add Data.meta as an official property in glue. Would that solve the problem here?

mattjhill commented 8 years ago

Yup that would solve it.

astrofrog commented 7 years ago

Just an update on this, glue now reads in ECSV files and correctly stores the metadata in Data.meta. I'll see if I can then hook that up to the data loading.

astrofrog commented 7 years ago

I'm a little worried that this solution requires the data table to be in ECSV format - I'm going to have a think about whether we can come up with a more general solution.

astrofrog commented 7 years ago

Here is a possible solution - we don't encode information about the readers in the table (since users would have to edit the table metadata to read it which seems not ideal) but instead when the table is loaded with the MOSViz loader, we pop up a window asking users to select which readers to use for 1D, 2D spectra and image cutouts. To avoid listing all data factories in each dropdown menu for the loaders, we create three decorators that users would use when defining their functions, e.g.:

@mosviz_spectrum_2d_loader
@data_factory('DEIMOS 2D Spectrum')
def deimos_spectrum2D_reader(file_name):
    """

which then allows us to narrow down the list of data factories we show. I'm going to work on a proof of concept since I don't think it should be very difficult (and we can always decide later whether we want to go ahead with that).

astrofrog commented 7 years ago

(I see that the idea proposed in this issue is already implemented, but I'd still like to explore the way I was suggesting above since I think it would be more user-friendly, and the two methods are not mutually exclusive)

astrofrog commented 7 years ago

See https://github.com/spacetelescope/mosviz/pull/57

astrofrog commented 6 years ago

Now that https://github.com/spacetelescope/mosviz/pull/57 is merged I think we've basically addressed the original issue here.