plot_spectrum_posterior failing is TypeError (and a few other issues)

ACCarnall / bagpipes

Bagpipes is a state of the art code for generating realistic model galaxy spectra and fitting these to spectroscopic and photometric observations. Users should install with pip, not by cloning the repository.

http://bagpipes.readthedocs.io

GNU General Public License v3.0

78 stars 41 forks source link

plot_spectrum_posterior failing is TypeError (and a few other issues) #2

Closed kevinhainline closed 5 years ago

kevinhainline commented 5 years ago

Hello! I'm excited to be using BAGPIPES, although I've discovered that plot_spectrum_posterior is currently breaking whenever I try to use it:

Traceback (most recent call last):
  File "bagpipes_christopher_fitting_test.py", line 112, in <module>
    fig = fit.plot_spectrum_posterior()
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/fitting/fit.py", line 184, in plot_spectrum_posterior
    return plotting.plot_spectrum_posterior(self, show=show, save=save)
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/plotting/plot_spectrum_posterior.py", line 30, in plot_spectrum_posterior
    add_photometry_posterior(fit, ax[-1], zorder=2, y_scale=y_scale[-1])
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/plotting/plot_spectrum_posterior.py", line 70, in add_photometry_posterior
    zorder=zorder-1, color="navajowhite", linewidth=0)
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/matplotlib/__init__.py", line 1867, in inner
    return func(ax, *args, **kwargs)
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 5095, in fill_between
    y1 = ma.masked_invalid(self.convert_yunits(y1))
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/numpy/ma/core.py", line 2363, in masked_invalid
    condition = ~(np.isfinite(a))
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In addition, right now I probably shouldn't be trying to use out_units='mujy' either, right? If you do that, the flux units on the y-axis when using galaxy.plot() are incorrect, and the fits don't seem to work either.

Finally, is there an issue with large redshifts BAGPIPES will consider because of the IGM model? If you try to look at galaxies at z > 10, the code continually returns an index error on igm_model.py:

Traceback (most recent call last):
  File "_ctypes/callbacks.c", line 315, in 'calling callback function'
  File "build/bdist.macosx-10.6-x86_64/egg/pymultinest/run.py", line 213, in loglike
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/fitting/fitted_model.py", line 132, in lnlike
    self.model_galaxy.update(self.model_components)
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/models/model_galaxy.py", line 224, in update
    self._calculate_full_spectrum(model_components)
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/models/model_galaxy.py", line 315, in _calculate_full_spectrum
    spectrum *= self.igm.trans(model_comp["redshift"])
  File "/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/models/igm_model.py", line 44, in trans
    / (config.igm_redshifts[zred_ind]
IndexError: index 1001 is out of bounds for axis 0 with size 1001

...although the code continues to run and produces a fit, just with the recovered redshift at less than 10. Thank you!

ACCarnall commented 5 years ago

Hi Kevin,

Thanks for trying out the code!

I think your first issue is being caused by a small mistake on my part in the most recent release, sorry about that. In the file plotting/plot_spectrum_posterior.py line 81 is commented, try uncommenting it and indenting line 82. Basically, if you try to plot the photometry posterior as a raster (necessary for avoiding 100MB plot files) and no part of it is actually on the axes an error is thrown by matplotlib. Line 81 was meant to catch this but I must have disabled it for some reason when I was testing stuff.

If you use out_units="mujy" when you set up the galaxy object you're fitting then this will probably fail as you say, I should probably make this clearer. I've been meaning to implement symbolic units in some form, but haven't got round to it yet. This is probably the underlying reason for the photometry posterior not being on the axes, as described above.

By default the code is set up to generate and fit models up to a maximum redshift of 10. This is basically a speed consideration, as the max redshift determines the rest-frame wavelength region of the models which might be needed, and hence has to be sampled at higher resolution, meaning larger arrays have to be manipulated. In order to increase the max redshift go into config.py and change the max_redshift parameter to the desired value, along with the upper limit of the igm_redshifts array. You will then need to delete the models/grids/d_igm_grid_inoue14.fits file, in order to force the code to re-calculate the IGM attenuation grid.

Let me know if any of these problems don't go away when you try the stuff above!

kevinhainline commented 5 years ago

Thanks for your reply!

For the first issue, I uncommented line 81 and indented line 82, and the plot_spectrum_posterior worked fine, thank you! So, that's great.

As for out_units="mujy", I'll just avoid using that for the time being.

Now, for the final point, I went to the config.py file and changed it to this:

# Sets the maximum redshift the code is set up to calculate models for.
max_redshift = 15.

and

# Redshift points for the IGM grid.
igm_redshifts = np.arange(0.0, 15.01, 0.01)

And I've also deleted the d_igm_grid_inoue14.fits file. When running the code on an object with spec-z = 13.51, and fit_instructions["redshift"] set to 0 - 15:

fit_instructions["redshift"] = (0., 15.)  # Vary observed redshift from 0 to 15

While running the fit I get the same pymultinest errors:

/Users/knh/.local/lib/python2.7/site-packages/pymultinest-2.7-py2.7.egg/pymultinest/run.pyc in loglike(cube, ndim, nparams, lnew, nullcontext)
    211                         if Prior:
    212                                 Prior(cube, ndim, nparams)
--> 213                         return LogLikelihood(cube, ndim, nparams)
    214 
    215        def dumper(nSamples,nlive,nPar,

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/fitting/fitted_model.pyc in lnlike(self, x, ndim, nparam)
    130                                              index_list=self.galaxy.index_list)
    131 
--> 132         self.model_galaxy.update(self.model_components)
    133 
    134         # Return zero likelihood if SFH is older than the universe.

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/models/model_galaxy.pyc in update(self, model_components)
    222 
    223         else:
--> 224             self._calculate_full_spectrum(model_components)
    225             self._calculate_uvj_mags()
    226 

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/models/model_galaxy.pyc in _calculate_full_spectrum(self, model_comp)
    313                                                               gamma)
    314 
--> 315         spectrum *= self.igm.trans(model_comp["redshift"])
    316 
    317         # Convert from luminosity to observed flux at redshift z.

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/models/igm_model.pyc in trans(self, redshift)
     42 
     43         zred_fact = ((redshift - config.igm_redshifts[zred_ind-1])
---> 44                      / (config.igm_redshifts[zred_ind]
     45                         - config.igm_redshifts[zred_ind-1]))
     46 

IndexError: index 1001 is out of bounds for axis 0 with size 1001

And the code seemingly never finishes running. If I try to limit the redshift that I explore for the galaxy:

fit_instructions["redshift"] = (10., 15.)  # Vary observed redshift from 0 to 15

It still never seems to finish.

This also happens if I change the fit_instruction redshift range:

fit_instructions["redshift"] = (0., 10.)  # Vary observed redshift from 0 to 15

kevinhainline commented 5 years ago

Also, how does BAGPIPES deal with negative fluxes? I recognize that very negative fluxes are treated as no data in a particular filter, but if we have a non-detection in a given filter with a low/negative flux, does BAGPIPES recognize this?

ACCarnall commented 5 years ago

Hi Kevin,

I just had a play around with this and remembered what's going on with the IGM table. The config file is supposed to contain all the reference values the code needs so that it's easy to change them. However in the case of the IGM table the config file also contains a check to make sure the table exists. This leads to circular import problems if the IGM-table-making code then refers back to the config file, so the IGM-table-making code contains its own version of igm_redshifts, which also needs to be changed.

This is clearly something I should fix, but it's going to require a bit more thought to do properly than I can give it right now. In the mean time, you can go into models/making/igm_inoue2014.py line 225 and change the upper limit on z_vals to the same as you changed the upper limit on igm_redshifts to in the config file. Then delete the IGM table and try again, I just ran a quick test case and this fixed the problem, so hopefully it'll work for you too.

Negative fluxes are treated by the code in exactly the same way as positive ones, so if you have something like a -0.5 sigma detection and you trust both the flux and error then everything will work fine. If you have something like a -10 sigma detection then clearly the error is either underestimated or there's some systematic that hasn't been accounted for. The code will however treat this as a real -10 sigma detection, and since it won't be able to generate models with negative fluxes this will drive the code to strongly favour models with very little flux in that filter. Perhaps this is what you want, but it's always best to try to properly characterise your uncertainties. One side-effect if you put in a -10 sigma detection is your quality of fit will always be poor, so you'll end up with underestimated uncertainties on your derived parameters.

kevinhainline commented 5 years ago

Your suggestion worked for the fit! Here's a corner plot for a z ~ 13.1 galaxy fit with BAGPIPES to some mock HST+NIRCam data:

Attempting to plot the posterior spectrum lead to an error when trying to use ax.fill_between in plot_spectrum_posterior.py, however. But for now, I can fit and return results for galaxies at z > 10!

In [5]: fig = fit.plot_spectrum_posterior()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-b560b5f9543f> in <module>()
----> 1 fig = fit.plot_spectrum_posterior()

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/fitting/fit.pyc in plot_spectrum_posterior(self, show, save)
    182 
    183     def plot_spectrum_posterior(self, show=False, save=True):
--> 184         return plotting.plot_spectrum_posterior(self, show=show, save=save)
    185 
    186     def plot_calibration(self, show=False, save=True):

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/plotting/plot_spectrum_posterior.pyc in plot_spectrum_posterior(fit, show, save)
     28 
     29     if fit.galaxy.photometry_exists:
---> 30         add_photometry_posterior(fit, ax[-1], zorder=2, y_scale=y_scale[-1])
     31 
     32     if save:

/Users/knh/anaconda2/lib/python2.7/site-packages/bagpipes/plotting/plot_spectrum_posterior.pyc in add_photometry_posterior(fit, ax, zorder, y_scale)
     68     ax.plot(log_wavs, spec_post[:, 1], color="navajowhite", zorder=zorder-1)
     69     ax.fill_between(log_wavs, spec_post[:, 0], spec_post[:, 1],
---> 70                     zorder=zorder-1, color="navajowhite", linewidth=0)
     71 
     72     phot_post = np.percentile(fit.posterior.samples["photometry"],

/Users/knh/anaconda2/lib/python2.7/site-packages/matplotlib/__init__.pyc in inner(ax, *args, **kwargs)
   1865                         "the Matplotlib list!)" % (label_namer, func.__name__),
   1866                         RuntimeWarning, stacklevel=2)
-> 1867             return func(ax, *args, **kwargs)
   1868 
   1869         inner.__doc__ = _add_data_doc(inner.__doc__,

/Users/knh/anaconda2/lib/python2.7/site-packages/matplotlib/axes/_axes.pyc in fill_between(self, x, y1, y2, where, interpolate, step, **kwargs)
   5093         # Convert the arrays so we can work with them
   5094         x = ma.masked_invalid(self.convert_xunits(x))
-> 5095         y1 = ma.masked_invalid(self.convert_yunits(y1))
   5096         y2 = ma.masked_invalid(self.convert_yunits(y2))
   5097 

/Users/knh/anaconda2/lib/python2.7/site-packages/numpy/ma/core.pyc in masked_invalid(a, copy)
   2361         cls = type(a)
   2362     else:
-> 2363         condition = ~(np.isfinite(a))
   2364         cls = MaskedArray
   2365     result = a.view(cls)

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Thank you so much for your help.

ACCarnall commented 5 years ago

Cool, looks nice!

I don't think I can diagnose that without seeing what ends up in those arrays. It's definitely not a problem others are having, did you manage to get posterior spectrum plots out for other objects? Can you print out the arrays which are being passed into fill_between and see what they contain and their type?

One thing to be wary of here is that the trace of the spectrum that gets added to the background of the photometry posterior plots assumes all draws from the spectrum posterior have the posterior median redshift, basically because otherwise you end up with a smeared out mess. This is a reasonable assumption if the redshift posterior is tightly constrained, but works increasingly poorly the larger the redshift uncertainty. With that in mind it might be worth just commenting lines 67-70 for this case, you could replace them with a single line plot of the posterior median spectrum for example.

kevinhainline commented 5 years ago

I was able to do most everything providing that I just comment out this line:

    ax.fill_between(log_wavs, spec_post[:, 0], spec_post[:, 1],
                    zorder=zorder-1, color="navajowhite", linewidth=0)

in the add_photometry_posterior() function in plot_spectrum_posterior.py. For one of my fitting objects, here are the arrays, the minimum and maximum values, and the types:

log (wavs) =

[1.14932333 1.16004719 1.17077106 ... 9.134394   9.14511787 9.15584173]
1.1493233259703228 9.155841733712348
<type 'numpy.ndarray'>

spec_post[:,0] =

[0.0 0.0 0.0 ... 2.1342962987396924e-12 1.891418666052292e-12
 4.5455692977009037e-17]
0.0 2.31580415326
<type 'numpy.ndarray'>

spec_post[:,1] =

[0.0 0.0 0.0 ... 1.9167780399309555e-11 1.6986531525781076e-11
 1.218203581721975e-16]
0.0 3.36744904039
<type 'numpy.ndarray'>

kevinhainline commented 5 years ago

One more question that I can't seem to find - what is the IMF that you assume throughout?

ACCarnall commented 5 years ago

The standard BC03 models implemented in bagpipes use the Kroupa IMF.

Not entirely sure what's going on with the plotting thing, from the information you sent I can't see an obvious reason for it to fail.

ACCarnall commented 5 years ago

In the new version (0.7.10) changing the max_redshift parameter in the config file will automatically cause a new IGM table to be generated. In the future it would be good to have a function for changing max_redshift without going into the source code, along with a few other similar variables in config.py

ACCarnall commented 5 years ago

Hi Kevin, I figured out the problem you were having with the plots. I don't fully understand why it was happening (something to do with types in numpy), but I was able to fix it in v0.7.11 by explicitly setting the array elements to be floats.

kevinhainline commented 5 years ago

Oh, this is great! Thank you for looking into this.

In the mean time, I have two questions about running BAGPIPES that maybe you can help answer.

1) I'm not having a lot of success running the code on high-redshift galaxies. If I fit a mock galaxy at z ~ 6 with a simple delayed SFH:

# simple fit. 
delayed = {}                                  # Delayed model star-formation history component
delayed["age"] = (0.1, 15.)                   # Vary age between 100 Myr and 15 Gyr. In practice 
                                          # the code automatically limits this to the age of
                                          # the Universe at the observed redshift.

delayed["tau"] = (0.01, 10.)                   # Vary tau between 300 Myr and 10 Gyr
delayed["massformed"] = (1., 15.)             # vary log_10(M*/M_solar) between 1 and 15
delayed["metallicity"] = (0., 2.5)            # vary Z between 0 and 2.5 Z_oldsolar

dust = {}                                 # Dust component
dust["type"] = "Calzetti"                 # Define the shape of the attenuation curve
dust["Av"] = (0., 2.)                     # Vary Av between 0 and 2 magnitudes

nebular = {}                              # Nebular Emission
nebular["logU"] = -3.

fit_instructions = {}                     # The fit instructions dictionary
fit_instructions["redshift"] = (0., 15.)  # Vary observed redshift from 0 to 15
fit_instructions["delayed"] = delayed   
fit_instructions["dust"] = dust

fit_instructions["redshift_prior"] = "Gaussian"  # From looking at the spectrum in Example 2 it's
fit_instructions["redshift_prior_mu"] = 1.0      # clear that this  object is at around z = 1. We'll 
fit_instructions["redshift_prior_sigma"] = 0.25  # include that information with a broad Gaussian
                                         # prior centred on redshift 1. Parameters of priors
                                         # are passed starting with "parameter_prior_".
fit_instructions["nebular"] = nebular

The result seems pretty confident that the Lyman break is the Balmer break:

This seems to happen for all of the high-redshift galaxies that I'm looking at. Here's another mock galaxy at z ~ 4.9:

I assume that this problem is more of a "I'm not exploring the correct parameter space" issue, but I thought I'd ask.

2) I also tried to double check that I was using BAGPIPES correctly by generating a sample of galaxies from JAGUAR (Williams et al. 2018), which were created using BEAGLE. I took each object, pulled its galaxy parameters, and tried to replicate this with BAGPIPES. Here's a comparison of the resulting JWST NIRCam F200W magnitude:

This discrepancy was why I initially asked about the IMF. I think that perhaps this is an issue with the BAGPIPES "massformed" vs. the JAGUAR "stellar mass" parameter, maybe? I would assume that massformed should generally be greater than stellar mass, so by setting a galaxy's massformed to the stellar mass of a comparison galaxy would lead to a fainter final BAGPIPES galaxy, which is what we see. But would the offset be this severe?

Anyway, thank you, as always, for your help!

ACCarnall commented 5 years ago

Hi Kevin, in the code snippet you posted you're fitting a Gaussian redshift prior centred on z = 1 with st. dev. = 0.25, so you're massively down-weighting z=6 solutions, and Lyman break as Balmer break is about the best it can do at z~1 i suspect. Try a uniform redshift prior (just delete all the redshift prior stuff) and I suspect you'll get a much more sensible answer.

You're absolutely right about the magnitude difference, I'd expect the fluxes you get out of bagpipes by setting formed mass to the stellar masses out of beagle to be low by about 30%. In order to fix this:

First generate the model as you have been, setting the formed mass to the stellar mass from beagle
Extract the bagpipes stellar mass (model.sfh.stellar_mass), which will be low by ~0.2-0.25 dex compared to the beagle mass
Take the ratio between the beagle mass and model.sfh.stellar_mass, and either multiply the output fluxes by this ratio, or re-generate the model using the beagle mass plus this difference.

Hope that makes sense!

kevinhainline commented 5 years ago

Adam, Thank you for your swift reply and for spotting the EXTREMELY OBVIOUS and VERY DUMB mistake I made.

When running BAGPIPES on the same objects as I discussed in the previous post, but without that stupid redshift prior, things look a lot better now:

Secondly, if I go and scale the input stellar masses to the massformed values with the method you outlined, I get these results:

They're much better, and most of the differences can be chalked up to the different treatment of dust between BAGPIPES and BEAGLE. But for now, I think that I've set myself up to take a closer look at some of these fitting results for a larger sample. Thank you!

Vladan1986 commented 1 year ago

Hi Adam,

I produce the same error when using galaxy.plot() and again with fit.plot_spectrum_posterior():

`

TypeError Traceback (most recent call last) /tmp/ipykernel_19208/3992200912.py in 2 photometry_exists = False, 3 ) ----> 4 fig = galaxy.plot()

~/anaconda3/lib/python3.9/site-packages/bagpipes/input/galaxy.py in plot(self, show) 226 227 def plot(self, show=True): --> 228 return plotting.plot_galaxy(self, show=show)

~/anaconda3/lib/python3.9/site-packages/bagpipes/plotting/plot_galaxy.py in plot_galaxy(galaxy, show, return_y_scale) 32 spec_ax = plt.subplot(gs[0, 0]) 33 ---> 34 y_scale_spec = add_spectrum(galaxy.spectrum, spec_ax) 35 36 if galaxy.photometry_exists:

~/anaconda3/lib/python3.9/site-packages/bagpipes/plotting/plot_spectrum.py in add_spectrum(spectrum, ax, x_ticks, zorder, z_non_zero, y_scale, ymax, color, lw, label, alpha) 51 #print(spectrum[:, 0]) 52 #print(lower, upper) ---> 53 ax.fill_between(spectrum[:, 0], lower, upper, color="dodgerblue", 54 zorder=zorder-1, alpha=0.75, linewidth=0) 55

~/anaconda3/lib/python3.9/site-packages/matplotlib/init.py in inner(ax, data, *args, kwargs) 1359 def inner(ax, *args, data=None, *kwargs): 1360 if data is None: -> 1361 return func(ax, map(sanitize_sequence, args), kwargs) 1362 1363 bound = new_sig.bind(ax, *args, **kwargs)

~/anaconda3/lib/python3.9/site-packages/matplotlib/axes/_axes.py in fill_between(self, x, y1, y2, where, interpolate, step, kwargs) 5384 def fill_between(self, x, y1, y2=0, where=None, interpolate=False, 5385 step=None, kwargs): -> 5386 return self._fill_between_x_or_y( 5387 "x", x, y1, y2, 5388 where=where, interpolate=interpolate, step=step, **kwargs)

~/anaconda3/lib/python3.9/site-packages/matplotlib/axes/_axes.py in _fill_between_x_or_y(self, ind_dir, ind, dep1, dep2, where, interpolate, step, **kwargs) 5290 5291 # Handle united data, such as dates -> 5292 ind, dep1, dep2 = map( 5293 ma.masked_invalid, self._process_unit_info( 5294 [(ind_dir, ind), (dep_dir, dep1), (dep_dir, dep2)], kwargs))

~/anaconda3/lib/python3.9/site-packages/numpy/ma/core.py in masked_invalid(a, copy) 2363 cls = type(a) 2364 else: -> 2365 condition = ~(np.isfinite(a)) 2366 cls = MaskedArray 2367 result = a.view(cls)

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' `

I found that the error appears when the spectrum is ~ < 10^-19 erg s^-1 cm^-2 A^-1, and for a brighter spectrum it disappears ... Maybe it is related to the plot limits and/or some values going negative, but I have not been able to figure it out.

I am using a newer version of bagpipes.

I attached a code which produces this error

bagpipes_model-test1.zip