PAHFIT / pahfit

Model Decomposition for Near- to Mid-Infrared Spectroscopy of Astronomical Sources
https://pahfit.readthedocs.io/
18 stars 26 forks source link

PAHFIT 2.5 version : interpolate fitting problem for Spitzer short-high / long-high spectra #260

Closed fpoidevin closed 2 months ago

fpoidevin commented 1 year ago

Hi everyone.

I would like to analyse spitzer high resolution spectra with PAHFIT but I’m encountering a problem (see end of message) at the interpolation level. I’m not sure how to fix this so any help or information would be much appreciated.

More details below:

I’m using python 3.8 and pahfit 2.5 on ventura macOS 13.2.1.

The spectrum I'm using to make a first test is from the cassis database (it can be found under AORkey = 25679616):

https://cassis.sirtf.com/atlas/cgi/aorkey.py?aorkey=%2025679616

*) To read the data I’m using the following :

data = ascii.read('cassis_tbl_full_25679616_1.tbl')

header = ascii.read(data.meta['comments'], delimiter='=',format='no_header')

x=data['wavelength'] y=data['flux'] # F_lambda : flux density (F_lambda) yerr=data['error']

ww=np.array(x)u.AA ff=np.array(y)u.Jy err_ff=StdDevUncertainty(yerr) spec = Spectrum1D(spectral_axis=ww, flux=ff, uncertainty=err_ff)

set instrument information, and redshift if > 0

spec.meta['instrument'] = 'spitzer.irs.sh' spec.set_redshift_to(0)

from pahfit.model import Model model = Model.from_yaml('classic.yaml')

*) The problem appears when I’m calling model.guess(spec).

Below is the screen error message:


model.guess(spec) /Users/frederickpoidevin/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/numpy/ma/core.py:1163: RuntimeWarning: overflow encountered in divide result = self.f(da, db, *args, **kwargs)

ValueError Traceback (most recent call last) Cell In[6], line 1 ----> 1 model.guess(spec)

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/pahfit/model.py:174, in Model.guess(self, spec, redshift) 172 # remake param_info to make sure we have any feature updates from the user 173 param_info = self._kludge_param_info(inst, z) --> 174 param_info = PAHFITBase.estimate_init(xz, yz, param_info) 175 self._backport_param_info(param_info)

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/pahfit/base.py:794, in PAHFITBase.estimate_init(obs_x, obs_y, param_info) 792 else: # if min(obs_x) > 5, use 5.5 um 793 lam = 5.5 --> 794 amp_guess = sp(lam) / bb(lam) 795 param_info[0]["amps"][i] = amp_guess 797 elif fix is False:

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/scipy/interpolate/_polyint.py:80, in _Interpolator1D.call(self, x) 59 """ 60 Evaluate the interpolant 61 (...) 77 78 """ 79 x, x_shape = self._prepare_x(x) ---> 80 y = self._evaluate(x) 81 return self._finish_y(y, x_shape)

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/scipy/interpolate/_interpolate.py:752, in interp1d._evaluate(self, x_new) 750 y_new = self._call(self, x_new) 751 if not self._extrapolate: --> 752 below_bounds, above_bounds = self._check_bounds(x_new) 753 if len(y_new) > 0: 754 # Note fill_value must be broadcast up to the proper size 755 # and flattened to work here 756 y_new[below_bounds] = self._fill_value_below

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/scipy/interpolate/_interpolate.py:786, in interp1d._check_bounds(self, x_new) 784 if self.bounds_error and above_bounds.any(): 785 above_bounds_value = x_new[np.argmax(above_bounds)] --> 786 raise ValueError("A value ({}) in x_new is above " 787 "the interpolation range's maximum value ({})." 788 .format(above_bounds_value, self.x[-1])) 790 # !! Should we emit a warning if some values are out of bounds? 791 # !! matlab does not. 792 return below_bounds, above_bounds

ValueError: A value (0.100990151) in x_new is above the interpolation range's maximum value (0.0019499319000000003).

In [7]:


Ultimately I would like to use pahfit on a full spectrum i.e. short-high + long-high (s-h+l-h) combined in one spectrum but to make it simple I’m making tests with a s-h spectrum.

For a (s-h+l-h) I’m not sure what would be the approriate field to enter in: spec.meta['instrument']

Cheers, Fred

drvdputt commented 1 year ago

For a (s-h+l-h) I’m not sure what would be the approriate field to enter in: spec.meta['instrument']

This will be documented eventually... but you can either pass a list of strings or a wildcard. For your case, ['spitzer.irs.sh', 'spitzer.irs.lh'], or 'spitzer.irs.?h'.

And the error of the guess function has to do with the initial guesses for the blackbody components. The old guess function has a hardcoded wavelength somewhere (5), and it fails if your spectrum does not contain that wavelength. Should be fixed after merging #242 . I just rebased the branch of this pull request onto the v2.5 master, so you could clone drvdputt/new_guess as a workaround.

fpoidevin commented 1 year ago

Thank you very much for rebasing the branch of this pull request onto the v2.5 master. I cloned it using the following command line: pip install git+https://github.com/drvdputt/pahfit.git@new_guess but still the same problem is appearing for some reason. I'm not sure I understand why.

Error message below in case it could help understanding what's going on:

ValueError Traceback (most recent call last) Cell In[3], line 1 ----> 1 model.guess(spec)

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/pahfit/model.py:220, in Model.guess(self, spec, redshift, integrate_line_flux, calc_line_fwhm) 216 amp_guess = sp(w) / bb(wsafe) 218 return amp_guess --> 220 loop_over_non_fixed("starlight", "tau", starlight_guess) 222 # count number of blackbodies in the model 223 nbb = len(self.features[self.features["kind"] == "dust_continuum"])

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/pahfit/model.py:200, in Model.guess..loop_over_non_fixed(kind, parameter, estimate_function, force) 198 row = self.features[row_index] 199 if not bounded_is_fixed(row[parameter]) or force: --> 200 guess_value = estimate_function(row) 201 # print(f"{row['name']}: setting {parameter} to {guess_value}") 202 self.features[row_index][parameter][0] = guess_value

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/pahfit/model.py:211, in Model.guess..starlight_guess(row) 207 w = wmin + 0.1 # the wavelength used to compare 208 if w < 5: 209 # wavelength is short enough to not have numerical 210 # issues. Evaluate both at w. --> 211 amp_guess = sp(w) / bb(w) 212 else: 213 # wavelength too long for stellar BB. Evaluate BB at 214 # 5 micron, and spectrum data at minimum wavelength. 215 wsafe = 5

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/scipy/interpolate/_polyint.py:80, in _Interpolator1D.call(self, x) 59 """ 60 Evaluate the interpolant 61 (...) 77 78 """ 79 x, x_shape = self._prepare_x(x) ---> 80 y = self._evaluate(x) 81 return self._finish_y(y, x_shape)

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/scipy/interpolate/_interpolate.py:752, in interp1d._evaluate(self, x_new) 750 y_new = self._call(self, x_new) 751 if not self._extrapolate: --> 752 below_bounds, above_bounds = self._check_bounds(x_new) 753 if len(y_new) > 0: 754 # Note fill_value must be broadcast up to the proper size 755 # and flattened to work here 756 y_new[below_bounds] = self._fill_value_below

File ~/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/scipy/interpolate/_interpolate.py:786, in interp1d._check_bounds(self, x_new) 784 if self.bounds_error and above_bounds.any(): 785 above_bounds_value = x_new[np.argmax(above_bounds)] --> 786 raise ValueError("A value ({}) in x_new is above " 787 "the interpolation range's maximum value ({})." 788 .format(above_bounds_value, self.x[-1])) 790 # !! Should we emit a warning if some values are out of bounds? 791 # !! matlab does not. 792 return below_bounds, above_bounds

ValueError: A value (0.100990151) in x_new is above the interpolation range's maximum value (0.0019499319000000003).

drvdputt commented 1 year ago

From this

ValueError: A value (0.100990151) in x_new is above the interpolation range's maximum value (0.0019499319000000003).

and this piece of code: w = wmin + 0.1 (the wavelength at which we're evaluating the interpolated spectrum) I see what the problem is. This hardcoded 0.1 micron offset sends w way above the maximum wavelength of your spectrum. So the code won't work for spectra that are less than 0.1 micron wide.

This also shows that your shortest wavelength is 0.00099 micron. That can't be right. I see that in the code of your first post, you use angstrom as the wavelength unit. In the future, this should be converted to micron internally. But at the moment, we just assume that the contents of Spectrum1D are in micron

Although, I don't think your maximum wavelength should be 0.0019 Angstrom either? So check the wavelength, and convert to micron.

fpoidevin commented 1 year ago

Thanks ! Indeed using u.AA was misleading and this short wavelength problem is removed using u.micron.

Once this is done another problem appears when calling model.guess(spec):

model.guess(spec) ...: /Users/frederickpoidevin/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/numpy/ma/core.py:1163: RuntimeWarning: overflow encountered in divide result = self.f(da, db, *args, **kwargs)

I was able to fix that by removing the pixels such that the flux was NaN in the spectrum. In such a case the model.guess(spec) works without any error problem. On the other hand I'm not sure of the quality of the final fit once model.fit(spec) is called because of the following:

In [3]: model.fit(spec) /Users/frederickpoidevin/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/pahfit/instrument.py:295: PAHFITWarning: Input wavelength bounds exceed instrument segment(s): Segment(s) cover: 9.97-19.44, wave_bounds: 10.00-19.44 warn(f"\nInput wavelength bounds exceed instrument segment(s){w}{st}", /Users/frederickpoidevin/opt/anaconda3/envs/pahfit/lib/python3.8/site-packages/numpy/ma/core.py:1163: RuntimeWarning: overflow encountered in divide result = self.f(da, db, *args, **kwargs) WARNING: The fit may be unsuccessful; check fit_info['message'] for more information. [astropy.modeling.fitting] Number of calls to function has reached maxfev = 1000.

Any suggestion of a turnaround better than removing the NaN pixels in the input spectrum ?

drvdputt commented 1 year ago

Input wavelength bounds exceed instrument segment(s): Segment(s) cover: 9.97-19.44, wave_bounds: 10.00-19.44

This is just a warning that tells you that the resolution curve (instrument model) is extrapolated a little bit. The current limits are a probably a little too strict. If its only a couple of percent, it shouldn't affect the fit quality.

Number of calls to function has reached maxfev = 1000.

I usually set maxiter to 10000. The model should be fully converged before you check if the result makes sense (unless it's stuck in a very slow convergence situation).

Any suggestion of a turnaround better than removing the NaN pixels in the input spectrum ?

Not really. PAHFIT assumes that the data are clean. The alternative to removing data points (and having to reorganize all the arrays), would be working with masked arrays. I'm not sure what will happen with masked out values in PAHFIT though, so consider that a future feature. The mask probably need to be consistent between the flux and uncertainty arrays of the Spectrum1D too.

Since every spectrum will be different in that regard, cleaning or at least masking should be a task for the user, in my opinion. There could be NaN values, Inf values, negative numbers, zeros in the uncertainty (also a problem!).

Checking and issuing an informative warning would probably help though.

fpoidevin commented 1 year ago

Indeed, using maxiter = 10000 improves the final fit.

Many thanks for all your feedback and inputs. This was very useful.