PAHFIT / pahfit

Model Decomposition for Near- to Mid-Infrared Spectroscopy of Astronomical Sources
https://pahfit.readthedocs.io/
18 stars 26 forks source link

Channels, adding features, plotting #254

Closed sophfb closed 1 year ago

sophfb commented 1 year ago

Hello! I am an undergraduate student at UMass Amherst working with Professor Alex Pope and have been using PAHfit for awhile to fit JWST mid infrared spectra. J. D. told us to direct our questions to here about the new API.

I first was wondering if there is a way to fit to only to a certain amount of channels. For example, we have data for all four channels of JWST miri data, but since channel four is noisy, we only want to fit channels one to three.

Another question is about the yaml science pack files. Should we be modifying these to create our own or should we wait for them to be released with the new version of PAHfit? Right now we are adding lines ourself but are wondering if this is a good use of time. Our spectrum needs lines that are <5 microns since our galaxies are redshifted to z~0.6.

Lastly, I was confused on the plotting features of the program. Is there a resource on how to use the sub_model function or just the plotting functions in general?

karllark commented 1 year ago

Hello! I am an undergraduate student at UMass Amherst working with Professor Alex Pope and have been using PAHfit for awhile to fit JWST mid infrared spectra. J. D. told us to direct our questions to here about the new API.

Hello!

I first was wondering if there is a way to fit to only to a certain amount of channels. For example, we have data for all four channels of JWST miri data, but since channel four is noisy, we only want to fit channels one to three.

You should be able to do this by not including the ch4 data in the fit.

Another question is about the yaml science pack files. Should we be modifying these to create our own or should we wait for them to be released with the new version of PAHfit? Right now we are adding lines ourself but are wondering if this is a good use of time. Our spectrum needs lines that are <5 microns since our galaxies are redshifted to z~0.6.

I look forward to hearing from others on this topic. @jdtsmith There is a pack that have features below 5 micron - checkout scipack_ExGal_AkariIRC+SpitzerIRSSLLL.ipac

Lastly, I was confused on the plotting features of the program. Is there a resource on how to use the sub_model function or just the plotting functions in general?

Here again, I am not the expert. @drvdputt ?

jdtsmith commented 1 year ago

Sorry PAHFIT is still in a bit of disarray as we clean things up. We need to port the old .ipac sci pack to a new gal.yaml model. I think @ThomasSYLai was going to work on that. YAML scipacks are the future, once the dev branch lands. Right now we are in a phase were we should be starting to tweak our YAML files for JWST. So a good approach would be to start with gal.yaml, and see where the fits are having problems. Then you can try/propose tweaks. It's fairly easy to do the tweak in the Features table (an astropy table in disguise) and iterate without having to actually write and rewrite the YAML file itself. We also need better docs on all this. Lots to do.

drvdputt commented 1 year ago

I first was wondering if there is a way to fit to only to a certain amount of channels. For example, we have data for all four channels of JWST miri data, but since channel four is noisy, we only want to fit channels one to three.

If you have the merged spectrum as a Spectrum1D object, then you can slice it as explained here https://specutils.readthedocs.io/en/stable/spectrum1d.html#slicing:

from astropy import units as u
s = <your channel 1-4 spectrum>
s_new = s[:18 * u.micron]

You can then choose the right instrument setting using

s_new.meta['instrument'] = "jwst.miri.mrs.[123].*"

Lastly, I was confused on the plotting features of the program. Is there a resource on how to use the sub_model function or just the plotting functions

sub_model might be a bit too abstract for most use cases. It will be replaced with a more straightforward function called tabulate, from pull request #249, and I hope that can get merged in soon. But as @jdtsmith stated above, we are still cleaning things up. After we merge some of these features, it should become more straightforward to contribute extra utilities such as more flexible plotting. Currently, the only plot function is Model.plot().

sophfb commented 1 year ago

Sorry PAHFIT is still in a bit of disarray as we clean things up. We need to port the old .ipac sci pack to a new gal.yaml model. I think @ThomasSYLai was going to work on that. YAML scipacks are the future, once the dev branch lands. Right now we are in a phase were we should be starting to tweak our YAML files for JWST. So a good approach would be to start with gal.yaml, and see where the fits are having problems. Then you can try/propose tweaks. It's fairly easy to do the tweak in the Features table (an astropy table in disguise) and iterate without having to actually write and rewrite the YAML file itself. We also need better docs on all this. Lots to do.

I looked in the science folder and couldn't find gal.yaml. Is it located somewhere else?

sophfb commented 1 year ago

You can then choose the right instrument setting using

s_new.meta['instrument'] = "jwst.miri.mrs.[123].*"

I tried this and got this error: PAHFITPackError: Could not locate instrument segment jwst.miri.mrs.[123].* Since the channels are labeled as 'ch2, ch2, ch3', I tried substituting that in for the '123' but that didn't work either.

sophfb commented 1 year ago

Sorry PAHFIT is still in a bit of disarray as we clean things up. We need to port the old .ipac sci pack to a new gal.yaml model. I think @ThomasSYLai was going to work on that. YAML scipacks are the future, once the dev branch lands. Right now we are in a phase were we should be starting to tweak our YAML files for JWST. So a good approach would be to start with gal.yaml, and see where the fits are having problems. Then you can try/propose tweaks. It's fairly easy to do the tweak in the Features table (an astropy table in disguise) and iterate without having to actually write and rewrite the YAML file itself. We also need better docs on all this. Lots to do.

I looked in the science folder and couldn't find gal.yaml. Is it located somewhere else?

If I make a new yaml file that includes all of the shorter wavelength (<5um) lines, do I put Br Alpha in the ionic line grouping?

jdtsmith commented 1 year ago

It doesn't yet exist. I think @ThomasSYLai was planning to create the new gal.yaml, but since that hasn't happened, let us do it and you can propose tweaks an additions to it. I'll be at STScI this week and Dries and I are working to merge the dev branch to main, so that should make all of this much simpler.

drvdputt commented 1 year ago

I tried this and got this error: PAHFITPackError: Could not locate instrument segment jwst.miri.mrs.[123].* Since the channels are labeled as 'ch2, ch2, ch3', I tried substituting that in for the '123' but that didn't work either.

Sorry about that. It's actually "jwst.miri.mrs.ch[123]*". When in doubt, you can always pass a list instead of a shorthand string. To make it extra clear, you can start with:

from pahfit import instrument
print(instrument.instruments())

which will list all possible combinations, and can also be used to test e.g. instrument.instruments("jwst*"). The output for "jwst*" gives

['jwst.nirspec.prism',
 'jwst.nirspec.g140.medium',
 'jwst.nirspec.g140.high',
 'jwst.nirspec.g235.medium',
 'jwst.nirspec.g235.high',
 'jwst.nirspec.g395.medium',
 'jwst.nirspec.g395.high',
 'jwst.miri.mrs.ch1.A',
 'jwst.miri.mrs.ch1.B',
 'jwst.miri.mrs.ch1.C',
 'jwst.miri.mrs.ch2.A',
 'jwst.miri.mrs.ch2.B',
 'jwst.miri.mrs.ch2.C',
 'jwst.miri.mrs.ch3.A',
 'jwst.miri.mrs.ch3.B',
 'jwst.miri.mrs.ch3.C',
 'jwst.miri.mrs.ch4.A',
 'jwst.miri.mrs.ch4.B',
 'jwst.miri.mrs.ch4.C']

​so you could just copy paste part of the above output and put them in a list like so

s_new.meta['instrument'] = ['jwst.miri.mrs.ch1.A',
 'jwst.miri.mrs.ch1.B',
 'jwst.miri.mrs.ch1.C',
 'jwst.miri.mrs.ch2.A',
 'jwst.miri.mrs.ch2.B',
 'jwst.miri.mrs.ch2.C',
 'jwst.miri.mrs.ch3.A',
 'jwst.miri.mrs.ch3.B',
 'jwst.miri.mrs.ch3.C',]
drvdputt commented 1 year ago

If I make a new yaml file that includes all of the shorter wavelength (<5um) lines, do I put Br Alpha in the ionic line grouping?

The groupings are there to offer conveniences e.g. to filter the results table after the fitting or help with custom plots. In one of the PDR packs we're developing, we have put the H I lines in a separate grouping, which looks like this.

 atomic_H_lines:
     kind: line
     bounds:
         # optional in case of wavelength calibration issues
         # wavelength: [-0.03%, 0.03%]
     wavelength:
         Pf_gamma:       3.740555
         Br_alpha:          4.05226
         Pf_beta:           4.65378
... (many more)

Groups are convenient because you need to specify the "kind" and "bounds" settings only once.

sophfb commented 1 year ago

I have a question about the units of the outputted power, in the documentation it says the power should be in [units given] x Hz, however, were finding that our values are off by about 10^-9 to 10^-10 when comparing against values that we know are right. We were trying to convert the PAHfit powers which should have been in Jy x Hz as per the documentation to W/m^2 to compare which is where this problem arose. We speculated that this could be because PAHfit is doing the integral for power in wavelength space instead of frequency space. Does anyone know why this might be?

I also have a quick question about the plotting, is there a way to make PAHfit show more than 18 microns when plotting? Even though I am putting in spectra that go to about 28 microns, it will only show up to 18 on the plots.

drvdputt commented 1 year ago

First of all, I'd like to share some news since our last conversation: the dev branch has been merged into master. So it would be best if you check out the master branch.

The units are currently in an annoying state. While the column name for the features is "power", we're actually still fitting the amplitude. So the units of "power" are currently equal to the units of the input spectrum, Jy in your case. You will have to convert your value for the power into an amplitude unit for now, if you want to compare. For example, for a Gaussian (the emission lines), you could use an equation like this:

stddev = fwhm / 2.35482004503
amplitude = power / (stddev * sqrt(2 * pi))

Or the other way round, to convert pahfit amplitude to power, I would suggest using astropy units, something like this

import math as m
from astropy import units as u
# the power unit you want
power_unit = u.W / u.m**2
# need central wavelength for conversions
central_wavelength = <pahfit wavelength> * u.micron
# convert fwhm to stddev
stddev = <pahfit fwhm> * u.micron / 2.35482004503
# convert per-frequency amplitude to per-wavelength amplitude
amplitude_per_wav = (<pahfit power> * u.Jy).to(power_unit / u.micron, equivalencies=u.spectral_density(central_wavelength))
# multiply amplitude per wavelength (m) by width in wavelength (micron)
power = (amplitude_per_wav * stddev * np.sqrt(2 * m.pi)).to(power_unit)

Note that the current state of the units is a temporary inconsistency. Standardizing the units is the top priority for what's next in the development. At the input, we will convert everything to fixed internal units, and at the output, things will be converted back to the user's chosen units. This should solve any confusion in the future.

drvdputt commented 1 year ago

I'm not sure what your plotting issue could be. If you send me your saved model I could take a quick look. Or you should be able to work around this by using the plotting interactively in Python. I suggest adding return fig, axs to the end of pahfit.model.Model.plot. Then you could change the limits using e.g. axs[0].set_xlim(5, 30)

sophfb commented 1 year ago

Thank you for the help with the units! Your fix worked perfectly and my values agree with the ones I was comparing against. I was wondering, however, if there was any way to extract the error in the amplitudes. We've been looking through the output tables and the example notebooks and can't find a way to access the errors.

I also tried the plotting work around but ended up with this error:

cannot unpack non-iterable NoneType object

This could be the way I set it up which was like this:

fig1, axs1 = model_fls1.plot(spec_fls1) axs1[0].set_xlim(5, 30)

Would that be the correct way?

drvdputt commented 1 year ago

I also tried the plotting work around

If you edit the code, you probably need to run pip install again. Using an editable install works around this (pip install -e . in the PAHFIT root directory).

any way to extract the error in the amplitudes

Not yet. There will be some form of error (and parameter covariance) estimation eventually, but I'm not sure about the time frame.

jdtsmith commented 1 year ago

On that topic, we should try to get a plan together for the internal API @drvdputt. Connect next week?

sophfb commented 1 year ago

Hello, a member of my research group has also been working with PAHfit and has a question:

Why are the fluxes multiplied by (1+z) in line 198 of model.py?

Here's lines 196-200 in model.py for reference:

transform observed wavelength to "physical" wavelength xz = x / (1 + z) # wavelength shorter yz = y (1 + z) # energy higher uncz = unc (1 + z) # uncertainty scales with flux return x, y, unc, xz, yz, uncz

drvdputt commented 1 year ago

I might have made a mistake here. I just double checked, and found this useful write up https://ui.adsabs.harvard.edu/abs/2002astro.ph.10394H/abstract.

Main takeaway: it's either multiply by (1+z) or divide by (1+z), depending on the units (per wavelength or per frequency).

I also checked shift_redshift_to from specutils, but that function doesn't change the flux at all.

I'll have to rethink this later. Thanks for checking.

sophfb commented 1 year ago

Thank you for the quick response! I have another question, is there a way to allow the atomic line widths to vary in the fits? In the fits I am getting, the best fit lines are way narrower than what the data shows.

drvdputt commented 1 year ago

Try using the use_instrument_fwhm=False option of fit(). This is a global override, and the behavior is explained in the docstring. The line width will be fixed to whatever is in the features table, unless bounds on the FWHM are provided, in that case it will be fit. You can either edit the features table directly in Python (after it was loaded), or edit the science pack YAML file and set the FWHM to a good intial guess and add a bounds section (the notebook explains how to add bounds).

jdtsmith commented 1 year ago

Better would be to post details of the fits and instrument chose. Lines should not be much narrower than the instrument model sets!

sophfb commented 1 year ago

fls 1.pdf

Here is the first object in our data set. As you can see, the fits for the atomic lines are narrower than the actual data, and this is with use_instrument_fwhm=False and the jwst instrument pack.

drvdputt commented 1 year ago

Can you print out or export the features table after the fit? Or check the bounds for the width?

I suggest printing model.features[model.features['kind'] == 'line']

After "guess", I get this in the demo notebook:

name | group | kind | temperature | tau | wavelength | power | fwhm | model | geometry
-- | -- | -- | -- | -- | -- | -- | -- | -- | --

H2_S(7) | H2_lines | line | <n/a> | <n/a> | 5.511 (Fixed) | 0 (0, inf) | 0.001456 (Fixed) | -- | --
H2_S(6) | H2_lines | line | <n/a> | <n/a> | 6.109 (Fixed) | 0 (0, inf) | 0.001699 (Fixed) | -- | --
H2_S(5) | H2_lines | line | <n/a> | <n/a> | 6.909 (Fixed) | 0 (0, inf) | 0.00197 (Fixed) | -- | --
H2_S(4) | H2_lines | line | <n/a> | <n/a> | 8.026 (Fixed) | 0 (0, inf) | 0.002309 (Fixed) | -- | --
H2_S(3) | H2_lines | line | <n/a> | <n/a> | 9.665 (Fixed) | 0 (0, inf) | 0.002868 (Fixed) | -- | --
H2_S(2) | H2_lines | line | <n/a> | <n/a> | 12.28 (Fixed) | 0 (0, inf) | 0.004121 (Fixed) | -- | --
H2_S(1) | H2_lines | line | <n/a> | <n/a> | 17.03 (Fixed) | 0 (0, inf) | 0.006681 (Fixed) | -- | --
H2_S(0) | H2_lines | line | <n/a> | <n/a> | 28.22 (Fixed) | 0 (0, inf) | 0 (Fixed) | -- | --
[ArII] | ionic_lines | line | <n/a> | <n/a> | 6.985 (Fixed) | 0 (0, inf) | 0.001973 (Fixed) | -- | --
[ArIII] | ionic_lines | line | <n/a> | <n/a> | 8.991 (Fixed) | 0 (0, inf) | 0.002842 (Fixed) | -- | --
[SIV] | ionic_lines | line | <n/a> | <n/a> | 10.51 (Fixed) | 0 (0, inf) | 0.00336 (Fixed) | -- | --
[NeII] | ionic_lines | line | <n/a> | <n/a> | 12.81 (Fixed) | 0 (0, inf) | 0.004302 (Fixed) | -- | --
[NeIII] | ionic_lines | line | <n/a> | <n/a> | 15.55 (Fixed) | 0 (0, inf) | 0.00747 (Fixed) | -- | --
[SIII]_18 | ionic_lines | line | <n/a> | <n/a> | 18.71 (Fixed) | 0 (0, inf) | 0.01131 (Fixed) | -- | --
[OIV] | ionic_lines | line | <n/a> | <n/a> | 25.91 (Fixed) | 0 (0, inf) | 0.01788 (Fixed) | -- | --
[FeII]_26 | ionic_lines | line | <n/a> | <n/a> | 25.99 (Fixed) | 0 (0, inf) | 0.018 (Fixed) | -- | --
[SIII]_33 | ionic_lines | line | <n/a> | <n/a> | 33.48 (Fixed) | 0 (0, inf) | 0 (Fixed) | -- | --
[SiII] | ionic_lines | line | <n/a> | <n/a> | 34.82 (Fixed) | 0 (0, inf) | 0 (Fixed) | -- | --
[FeII]_35 | ionic_lines | line | <n/a> | <n/a> | 35.35 (Fixed) | 0 (0, inf) | 0 (Fixed) | -- | --

So it's no surprise that the widths aren't fit, because they're all fixed. You can either:

Once these values have been set, the fwhm will be fit between the given bounds. The bounds are None be default, meaning that the fwhms are fixed during the fit.

sophfb commented 1 year ago

So I am currently setting the FWHM value and bounds like this:

model_fls1_edit.features.loc['[ArII]']['fwhm'][1] = 0.053 model_fls1_edit.features.loc['[ArII]']['fwhm'][2] = [0.0265, 0.0795]

But I am getting this error:

TypeError: float() argument must be a string or a number, not 'list'

Is there a different way that the bounds are set in the features table as opposed to in the yaml file?

drvdputt commented 1 year ago

The correct syntax for the table is not entirely intuitive (which is why I suggested something like set_bounds before for the Features class).

If you print out the table element model_fls1_edit.features.loc['[ArII]']['fwhm'], you will see that it is a 3-tuple. The first value (index 0) is the value, and the bounds are located at index 1 and 2. Here is a correction to set the values you want:

model_fls1_edit.features.loc['[ArII]']['fwhm'][0] = 0.053
model_fls1_edit.features.loc['[ArII]']['fwhm'][1:] = [0.0265, 0.0795]
sophfb commented 1 year ago

Thank you! That definitely fixed that issue, I am having a different issue however. It seems like after I change the fwhm's, it saves them but then after I do the fit it reverts back. If I change [ArII]'s fwhm values to what I did above, do the fit and then print the fwhm again, it gives me:

[0.0021803223044520553 -- --]

which is what it was before.

Do you know why this might be?

drvdputt commented 1 year ago

You also need to disable the FWHM model in the fit() call, I think, it is True be default. There's an optional keyword for that: use_instrument_fwhm. Note that setting this to False will disable the FWHM determination for ALL the lines. So ideally you'd set bounds for all the lines before calling fit.

It is kind of annoying that this needs to be set both in guess and fit though. Especially if you've already gone through the effort of manually adding bounds to the lines. Fixing things like this will become more straightforwards once we rewrite the (astropy) model construction method; currently this is done by the old code in base.py, which is always confusing to make change to.

jdtsmith commented 1 year ago

@drvdputt I think we talked about this in Baltimore. If bounds are set in the Features table, the instrument pack should leave them alone. But @sophfb if the instrument pack fails to fit the lines well, this could be a sign that it needs improvement. Can I ask you to open a new issue with a good descriptive name detailing just that problem, maybe with a zoom in on the problematic area of fit? I'll close this for now as we try to keep to one problem/feature = one issue.