PAHFIT / pahfit

Model Decomposition for Near- to Mid-Infrared Spectroscopy of Astronomical Sources
https://pahfit.readthedocs.io/
18 stars 26 forks source link

Output fit spectrum? #248

Closed karinsandstrom closed 1 year ago

karinsandstrom commented 1 year ago

It does not seem like there is an easy way to output the fit spectrum to a file. That would be very helpful!

drvdputt commented 1 year ago

So, there are a couple of ways we could do this

  1. Return a functional that gives you flux when you evaluate it at a (grid of) wavelengths. (Same as sub_model). I have this in the demo notebook
    # the instrument and redshift are still required here, to keep only the features that were fit in the data range
    dust_feature_function = model4.sub_model(instrumentname=spec.meta['instrument'], redshift=spec.redshift, kind='dust_feature')

    Some extra code is needed to write this out

if you want a spectrum1D saved as fits

oneplusz = 1 + z wrest = spec.spectral_axis.value / oneplusz pah_spectrum = Spectrum1D(spectral_axis=wrest spec.spectral_axis.unit, flux=dust_feature_function(wrest) spec.flux.unit) pah_spectrum.write('region_3_miri_pah.fits', format='tabular-fits')

for ascii, workaround through astropy table

from astropy.table import Table t = Table() t.add_column(pah_spectrum.spectral_axis, name='WAVELENGTH') t.add_column(pah_spectrum.flux, name='FLUX') t.write('region_3_miri_pah.ecsv', format='ascii.ecsv', overwrite=True)

2. Return a grid of fluxes or a Spectrum1D object, given a grid of wavelengths

something like

fluxes = model.evaluate(spec or wavs)


3. Return nothing and just write to a file directly.

I feel like we'll need at least two of these. Number 1 and/or 2 would be good for making custom plots. Number 3 would be nice so that the users don't have to figure out the writing op spectrum1d/astropy.
jdtsmith commented 1 year ago

We should follow the wiki here and make a little method like Model.get_model that can take either a Sepectrum1D or just an array of wavelengths, to evaluate the model (and, optionally, a kind from the Table). We don't want the astropy objects "leaking" out, since they may go away soon.

drvdputt commented 1 year ago

So the wiki has these things

pah7 = om.get_fit(group='PAH77') # w/ default sampling
neIII = om.get_fit('NeIII', samples=nsamp) # Just one line, by name

So this will return a discretized version of (selected components of) the fitted model.

Some comments:

The available formats are:

Format Read Write Auto-identify


tabular-fits Yes Yes Yes wcs1d-fits Yes Yes Yes

jdtsmith commented 1 year ago

Good points. One way to fix the default problem is to insist on being given a Spectrum1D or a wavelength array of any shape. I mean np.linspace(min, max,.05) is easy enough for people I bet. In terms of writing, FITS may be fine, or people can make custom writers, e.g. just using Table.

jdtsmith commented 1 year ago

Another simplification is to just use all the rich selection Table already provides, then just have get_model() as a simple method. So

fit=model.fit(spec,...)
just_77 = fit[fit['group']='PAH77'].get_model(spec)

Otherwise we are just reinventing how to sort/select Table. I don't recall, does such pass-through work now (I think this was among the arguments for pahfit.Model isa Table).

drvdputt commented 1 year ago

Well, at the moment model.fit() doesn't return anything, and it's not clear to me what the difference between get_model and get_fit is supposed to be. So not sure how to interpret the above.

The wiki has this

pahh11 = om.get_model('PAH11.0', samples = nsamp) # Model of individual feature, by name

# Grouped features are specified in the sci-pack.  
pah7 = om.get_fit(group='PAH77') # w/ default sampling

so it look like they're doing the same to me.

I think a more meaningful name would be "export_fit" or "tabulate_fit". I do see the use of having a flexible way of selecting what parts of the model to export though. E.g. "all the lines of group "H lines" between wavelengths x and y".

drvdputt commented 1 year ago

If we don't want to put those types of options into the function arguments, the current workaround is something like this

model_sliced = Model(model.features[model.features['group'] == 'PAH77')
just_77 = model_sliced.get_fit(wavelengths)
jdtsmith commented 1 year ago

at the moment model.fit() doesn't return anything

Doh, right. Ignore that. I don't think it needs to return anything.

it's not clear to me what the difference between get_model and get_fit is supposed to be

Don't read the wiki too closely, that was before the true power of the Table became clear. Maybe we can provide a getitem so users can index a Model object "just like a table". We're already passing the repr through (with additions). There are probably other pass-throughs needed too (like group_by). Maybe we should just subclass after all.

drvdputt commented 1 year ago

I though about having a __getitem__() passthrough briefly. The problem with that is that the behavior is rather complex. If a mask is passed, then the returned value should be a new Model object, but if a column name is passed, then it should return a column, etc.

Edit: although maybe trying this turns out to be simpler than expected. Just return self.features[index], but if self.features[index] is a table with all of the same columns, return Model(self.features[index])

Of course, subclassing will solve that inherit all of the behavior. But there might be some problems with subclassing, when we go back to the idea of making Model a combination of Features and instrument. Then we might need a wrapper, to make sure that instrument (and other info) is preserved. (unless we put everything in the Table metadata, or the Table implementation has some magic to copy over all other attributes?)

We could try making a small proof of concept to try this out.

Edit 2: I had a thought on why a Model does not perfectly map to a Table. With tables you can do something like this model.features['name', 'kind'] and this will return another table. If Model was a subclass of table, then this would return a Model with only those columns, and everything will break.

So only row-wise slices/maskings of the original table are still Models. Not any other type of slicing.

jdtsmith commented 1 year ago

All good points. A "single column model" makes no sense, I agree.

So perhaps keep model.features as it is (I will promise that this is the last time I'll make you talk me into this.. again ;).

For retrieving the model, I like tabulate. So something like:

model.fit(...)
model.tabulate()  # -> tabulated model as a Spectrum1D object at samples pahfit picks, "matching instrument pack" from most recent fit
model.tabulate(spec)  # -> tabulated model as a Spectrum1D at spec's wavelengths
model.tabulate(waves)  # -> tabulated model as an array at wavelength waves, the same shape as waves
model.tabulate(..., feature_mask=model.features['wavelength']>8.5) # -> same but row mask selection applied first
model.tabulate(..., feature_mask=model.features['group']=='PAH_7.7_cmp') # -> or this
model.tabulate(..., components=True) # Same but as a dictionary with "most common components" broken out (i.e. those we plot: starlight, continuum, dust_features, lines, attenuation_vec).  Like pahfit_components
model.tabulate(..., features = model.features.loc['silicate']) # maybe we don't need this?
jdtsmith commented 1 year ago

For the last case an advanced user can just make their own new Model, yes?

drvdputt commented 1 year ago

Yes, that can always be done (but then, information "from the last fit" that is not in the table can not be used of course). But I think the feature_mask argument should already encompass everything.

I'm working on a draft of this function. There are some extra issues that become clear trying to implement this. Things to think about include

I think this will be easier to discuss once you've seen the draft, so I'll put in a pull request.

jdtsmith commented 1 year ago

Sounds good I'll have a look.

jdtsmith commented 1 year ago

Closed by #249