Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

ValueError: Querying emulator outside of original parameter range #153

Closed Aseman7 closed 1 year ago

Aseman7 commented 1 year ago

Hello. In the priors, I would like to change the effective temperature and log g distributions from normal to uniform to get more reasonable results. But whenever I choose a uniform distribution, even with intervals smaller than the emulator range, the MCMC sampling goes out of range (ValueError: Querying emulator outside of original parameter range). I would be very grateful if you could suggest what range to use for these priors so that the sampler stays inside the emulator range. Or maybe the problem is caused by something else that I can't find. Any other suggestions would be appreciated too.

ranges = [[2000, 4000], [4.5, 5.5], [-1, 1]] # T, logg, Z

priors = {
    "T": st.uniform(2850, 3050),
    "logg": st.uniform(4.7, 5.1),
    "Z": st.uniform(-0.8, 0.8),
    "global_cov:log_amp": st.norm(-57, 3),
    "global_cov:log_ls": st.uniform(0, 10),
}

mileslucas commented 1 year ago

Can you take a look at the emulator and try loading some spectra from it?

For example

emu.load_flux([2000, 4.5, -1])

You should be able to load all the spectra across those parameter ranges. Once that's confirmed we can debug further.
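
A minimal sketch of such a check, assuming only that `load_flux` raises `ValueError` when a query falls outside the emulator's range. The helper `find_failing_corners` and the stand-in `fake_load_flux` (with its arbitrary 2300 cutoff) are hypothetical names for illustration, not part of Starfish:

```python
from itertools import product


def find_failing_corners(load_flux, ranges):
    """Return the corner points of `ranges` for which `load_flux` raises ValueError."""
    failures = []
    for corner in product(*ranges):
        try:
            load_flux(list(corner))
        except ValueError:
            failures.append(corner)
    return failures


# Hypothetical stand-in for emu.load_flux: rejects any temperature below 2300
def fake_load_flux(params):
    if params[0] < 2300:
        raise ValueError("Querying emulator outside of original parameter range.")
    return None


ranges = [[2000, 4000], [4.5, 5.5], [-1, 1]]  # T, logg, Z
failing = find_failing_corners(fake_load_flux, ranges)
print(failing)
```

With a real emulator you would pass `emu.load_flux` in place of the stand-in; an empty list would mean every corner of the requested ranges loads.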

Aseman7 commented 1 year ago

Thanks a lot for your reply, @mileslucas. You are right; unfortunately, the emulator is not able to load the spectrum and raises the "outside of original parameter range" error. The process of creating the emulator appears to be correct, though. How can I solve this problem? I appreciate your help.

emu.save("F_SPEX_emu.hdf5")

emu.load_flux([2200,4.7,-1])

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_10902/2879160586.py in <module>
----> 1 emu.load_flux([2200,4.7,-1])

~/anaconda3/lib/python3.9/site-packages/Starfish/emulator/emulator.py in load_flux(self, params, norm)
    419         flux : numpy.ndarray
    420         """
--> 421         mu, cov = self(params, reinterpret_batch=False)
    422         weights = np.random.multivariate_normal(mu, cov).reshape(-1, self.ncomps)
    423         X = self.eigenspectra * self.flux_std

~/anaconda3/lib/python3.9/site-packages/Starfish/emulator/emulator.py in __call__(self, params, full_cov, reinterpret_batch)
    376         # If the pars is outside of the range of emulator values, raise a ModelError
    377         if np.any(params < self.min_params) or np.any(params > self.max_params):
--> 378             raise ValueError("Querying emulator outside of original parameter range.")
    379 
    380         # Do this according to R&W eqn 2.18, 2.19

ValueError: Querying emulator outside of original parameter range.

mileslucas commented 1 year ago

Can you paste how you created your emulator? The symptoms look like you just don't have all the model spectra you think you do, so I want to double check before looking too deep through the code for bugs.

Aseman7 commented 1 year ago

import numpy as np
from Starfish.grid_tools import download_PHOENIX_models
ranges = [[2000, 4000], [4.0, 5.5], [-1, 1]]  # T, logg, Z
download_PHOENIX_models(path="PHOENIX", ranges=ranges)
lte04000-5.50+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits: 100%|█| 360/360 [00:00]

Setting up a raw grid interface

from Starfish.grid_tools import PHOENIXGridInterfaceNoAlpha
grid = PHOENIXGridInterfaceNoAlpha(path="PHOENIX")

Setting up the HDF5 interface for this raw grid based on selected ranges.

from Starfish.grid_tools.instruments import SPEX_SXD
from Starfish.grid_tools import HDF5Creator
creator = HDF5Creator(grid, "F_SPEX_grid.hdf5", instrument=SPEX_SXD(), wl_range=(0.9e4, np.inf), ranges=ranges)
creator.process_grid()
Processing [4.0e+03 5.5e+00 1.0e+00]: 100%|███| 360/360 [04:17<00:00,  1.40it/s]
from Starfish.emulator import Emulator
emu = Emulator.from_grid("F_SPEX_grid.hdf5")
emu
Emulator
--------
Trained: False
lambda_xi: 1.000
Variances:
    10000.00
    10000.00
    10000.00
    10000.00
    10000.00
    10000.00
    10000.00
    10000.00
Lengthscales:
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
    [ 300.00  1.50  1.50 ]
Log Likelihood: -5094.12
%time emu.train(options=dict(maxiter=1e5))
emu
CPU times: user 1h 33min 19s, sys: 10min 6s, total: 1h 43min 25s
Wall time: 34min 26s
Out[6]:
Emulator
--------
Trained: True
lambda_xi: 1.007
Variances:
    22009.95
    65936.17
    70057.88
    31126.07
    6100.37
    17248.00
    16650.47
    705.08
Lengthscales:
    [ 524.84  1.82  1.04 ]
    [ 380.02  1.71  1.78 ]
    [ 374.98  1.71  1.26 ]
    [ 430.45  1.63  1.01 ]
    [ 328.49  1.28  2.02 ]
    [ 639.56  1.35  1.02 ]
    [ 560.35  2.44  1.00 ]
    [ 516.53  1.48  1.00 ]
Log Likelihood: -4257.67
%matplotlib inline
from Starfish.emulator.plotting import plot_emulator
plot_emulator(emu)

emu.save("F_SPEX_emu.hdf5")
emu.load_flux([2200,4.7,-1])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_10902/2879160586.py in <module>
----> 1 emu.load_flux([2200,4.7,-1])

~/anaconda3/lib/python3.9/site-packages/Starfish/emulator/emulator.py in load_flux(self, params, norm)
    419         flux : numpy.ndarray
    420         """
--> 421         mu, cov = self(params, reinterpret_batch=False)
    422         weights = np.random.multivariate_normal(mu, cov).reshape(-1, self.ncomps)
    423         X = self.eigenspectra * self.flux_std

~/anaconda3/lib/python3.9/site-packages/Starfish/emulator/emulator.py in __call__(self, params, full_cov, reinterpret_batch)
    376         # If the pars is outside of the range of emulator values, raise a ModelError
    377         if np.any(params < self.min_params) or np.any(params > self.max_params):
--> 378             raise ValueError("Querying emulator outside of original parameter range.")
    379 
    380         # Do this according to R&W eqn 2.18, 2.19

ValueError: Querying emulator outside of original parameter range.

Aseman7 commented 1 year ago

By testing more examples, I found that the emulator raises "ValueError: Querying emulator outside of original parameter range" for effective temperatures between 2000 and 2300, while it works well for T_eff in [2300, 4000] and for the full metallicity and log g ranges. The ranges were:

ranges = [[2000, 4000], [4.0, 5.5], [-1, 1]] # T, logg, Z

Therefore, I made another emulator that starts at 2300:

ranges = [[2300, 4000], [4.0, 5.5], [-1, 1]] # T, logg, Z

This time, the emulator loads correctly for all ranges.

So I went ahead with this new emulator, but I still have the same problem when I replace the normal distributions with uniform distributions in the priors. For example, Starfish runs without problems with these priors:

{
    "T": st.norm(2950, 50),
    "logg": st.norm(4.9, 0.0068),
    "Z": st.uniform(-0.8, 0.8),
    "global_cov:log_amp": st.norm(-57, 3),
    "global_cov:log_ls": st.uniform(0, 10),
}

But it goes wrong when I use these priors:

{
    "T": st.uniform(2850, 3050),
    "logg": st.uniform(4.7, 5.1),
    "Z": st.uniform(-0.8, 0.8),
    "global_cov:log_amp": st.norm(-57, 3),
    "global_cov:log_ls": st.uniform(0, 10),
}

Specifically, the MCMC sampling cell raises this error: "ValueError: Querying emulator outside of original parameter range".

I would be very grateful if you could guide me, because I need to use Starfish for some M dwarfs in my project. @mileslucas, if you need to see my code and data, please let me know. Thanks a lot.

gully commented 1 year ago

The problem is that the PHOENIX models do not extend below Teff = 2300 K:

https://phoenix.astro.physik.uni-goettingen.de/?page_id=15
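
This would explain the failure at 2200 K even though the requested range started at 2000 K. The traceback above compares the query against `self.min_params` and `self.max_params`; below is a minimal NumPy sketch of that comparison, assuming those bounds are simply the minimum and maximum of the grid points that were actually available. The grid spacings and the `grid_points` array are assumptions made up for illustration:

```python
import numpy as np

# Hypothetical grid: T from 2300 K (lowest available) to 4000 K, plus assumed logg and Z steps
T = np.arange(2300, 4001, 100)
logg = np.arange(4.0, 5.51, 0.5)
Z = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
grid_points = np.array(np.meshgrid(T, logg, Z, indexing="ij")).reshape(3, -1).T

# Assumption: the bounds come from the grid that was actually loaded
min_params = grid_points.min(axis=0)
max_params = grid_points.max(axis=0)


def in_range(params):
    """Mirror of the range check shown in the traceback."""
    params = np.asarray(params, dtype=float)
    return not (np.any(params < min_params) or np.any(params > max_params))


print(in_range([2200, 4.7, -1]))   # T is below the lowest grid temperature
print(in_range([2950, 4.9, 0.0]))  # well inside the grid
```

With these assumed spacings the sketch grid has 360 points, the same count shown in the progress bars earlier in the thread.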

Aseman7 commented 1 year ago

Thanks, @gully, that's a good point.

As I said, I changed the T_eff range to [2300, 4000], and the emulator works correctly.

However, I still get the error when I use a uniform distribution. Do you know how I can solve this problem? Thanks for your time.

mileslucas commented 1 year ago

@Aseman7

Thanks for the email. I ran through your model and found the issue; it's actually fairly innocuous. Your logg uniform distribution is presumably supposed to be [4.7, 5.1], but I saw in the error output from emcee that it was trying to use a logg of 5.59:

emcee: Exception while calling your likelihood function:
  params: [2.96219896e+03 5.58664141e+00 4.64961918e-01]

The problem is your priors are specified incorrectly:

st.uniform(4.7, 5.3)

is NOT a uniform distribution from 4.7 to 5.3. The two arguments are loc and scale, so it is a uniform distribution over [4.7, 4.7 + 5.3] (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.uniform.html)

To achieve what I presume you want, you would instead use

st.uniform(4.7, 0.6)
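
A short sketch comparing the two calls, using the fact that `scipy.stats.uniform(loc, scale)` is uniform on `[loc, loc + scale]`. The helpers `uniform_between` and `bounds` are hypothetical conveniences for illustration, not part of SciPy or Starfish:

```python
import scipy.stats as st


def uniform_between(low, high):
    """Uniform distribution on [low, high], expressed via loc and scale."""
    return st.uniform(loc=low, scale=high - low)


def bounds(dist):
    """Lower and upper edge of a bounded frozen distribution."""
    return float(dist.ppf(0)), float(dist.ppf(1))


wrong = st.uniform(4.7, 5.3)        # loc=4.7, scale=5.3: upper edge is 4.7 + 5.3
right = st.uniform(4.7, 0.6)        # loc=4.7, scale=0.6: upper edge is 4.7 + 0.6
helper = uniform_between(4.7, 5.3)  # same interval as `right`, written as (low, high)

for name, dist in [("wrong", wrong), ("right", right), ("helper", helper)]:
    print(name, bounds(dist))
```

Writing priors through a `(low, high)` helper like this keeps the intended interval visible in the code, at the cost of one extra function.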

In your notebook I changed your priors to

priors = {
    "T": st.uniform(2800, 300),
    "logg": st.uniform(4.8, 0.5),
    "Z": st.uniform(0.0, 0.8),
    #"Av": st.halfnorm(0, 0.1),
    "global_cov:log_amp": st.norm(-57, 1),
    "global_cov:log_ls": st.uniform(0, 10),
}

and I was able to start sampling. Hopefully this works for you.

If you run into this issue again and can confirm that the params shown in the emcee trace should be allowed by your priors, or if this doesn't solve your problem, please reopen this issue.
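
One way to catch this kind of mismatch before starting a run is to compare each prior's support with the emulator's parameter range. A minimal sketch, assuming the priors are SciPy frozen distributions and using the grid ranges quoted earlier in the thread; `priors_outside_range` is a hypothetical helper, not a Starfish function:

```python
import numpy as np
import scipy.stats as st


def priors_outside_range(priors, limits):
    """Return names of priors whose bounded support extends past `limits`.

    Unbounded priors (e.g. normal) are skipped, since their support is infinite.
    """
    flagged = []
    for name, (low, high) in limits.items():
        lo, hi = priors[name].ppf(0), priors[name].ppf(1)
        if not (np.isfinite(lo) and np.isfinite(hi)):
            continue
        if lo < low or hi > high:
            flagged.append(name)
    return flagged


# Emulator ranges from the thread: T, logg, Z
limits = {"T": (2300, 4000), "logg": (4.0, 5.5), "Z": (-1, 1)}

original = {
    "T": st.uniform(2850, 3050),   # loc + scale = 5900, far above 4000
    "logg": st.uniform(4.7, 5.1),  # loc + scale = 9.8, far above 5.5
    "Z": st.uniform(-0.8, 0.8),    # spans -0.8 to 0.0
}
corrected = {
    "T": st.uniform(2800, 300),
    "logg": st.uniform(4.8, 0.5),
    "Z": st.uniform(0.0, 0.8),
}

print(priors_outside_range(original, limits))
print(priors_outside_range(corrected, limits))
```

The check only covers bounded priors; for a normal prior you would have to decide separately how much probability mass outside the emulator range is acceptable.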