tvwenger / bayes_spec

A Bayesian Spectral Line Modeling Framework
MIT License
5 stars 2 forks source link

Unable to Check Prior Distributions #24

Closed danabalser closed 3 weeks ago

danabalser commented 1 month ago

@tvwenger I had problems checking the prior distribution. Following the HFSModel Tutorial using real ALMA data here is what I get:

(bayes_cn_hfs) [dbalser@zuul03 bayes_cn_hfs]$ ipython
Python 3.12.5 | packaged by conda-forge | (main, Aug  8 2024, 18:36:51) [GCC 12.4.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.27.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: # Target
   ...: src='G005.885'
   ...: 
   ...: # =================
   ...: # General imports |
   ...: # =================
   ...: 
   ...: import os
   ...: import pickle
   ...: import time
   ...: 
   ...: import matplotlib.pyplot as plt
   ...: import arviz as az
   ...: import pandas as pd
   ...: import numpy as np
   ...: import pymc as pm
   ...: 
   ...: print("pymc version:", pm.__version__)
   ...: 
   ...: import bayes_spec
   ...: print("bayes_spec version:", bayes_spec.__version__)
   ...: 
   ...: import bayes_cn_hfs
   ...: print("bayes_cn_hfs version:", bayes_cn_hfs.__version__)
   ...: 
   ...: from bayes_cn_hfs import HFSModel
   ...: 
   ...: # Notebook configuration
   ...: pd.options.display.max_rows = None
   ...: 
   ...: # random state
   ...: rng = np.random.RandomState(seed=1234)
   ...: 
pymc version: 5.16.2
bayes_spec version: 1.6.3
bayes_cn_hfs version: 1.0.0

   ...: 
   ...: from bayes_spec import SpecData
   ...: 
   ...: data_12CN = np.genfromtxt("G005.885_CN.tsv")
   ...: data_13CN = np.genfromtxt("G005.885_13CN.tsv")
   ...: 
   ...: # estimate noise
   ...: noise_12CN = 1.4826 * np.median(np.abs(data_12CN[:, 0] - np.median(data_12CN[:, 0])))
   ...: noise_13CN = 1.4826 * np.median(np.abs(data_13CN[:, 0] - np.median(data_13CN[:, 0])))
   ...: 
   ...: obs_12CN = SpecData(
   ...:     1000.0 * data_12CN[:, 0],
   ...:     data_12CN[:, 1],
   ...:     noise_12CN,
   ...:     xlabel=r"LSRK Frequency (MHz)",
   ...:     ylabel=r"$T_{B,\,\rm CN}$ (K)",
   ...: )
   ...: obs_13CN = SpecData(
   ...:     1000.0 * data_13CN[:, 0],
   ...:     data_13CN[:, 1],
   ...:     noise_13CN,
   ...:     xlabel=r"LSRK Frequency (MHz)",
   ...:     ylabel=r"$T_{B,\,\rm CN}$ (K)",
   ...: )
   ...: data_12CN = {"observation": obs_12CN}
   ...: data = {"12CN": obs_12CN, "13CN": obs_13CN}
   ...: 
   ...: # Plot the data
   ...: fig, axes = plt.subplots(2, layout="constrained")
   ...: axes[0].plot(data["12CN"].spectral, data["12CN"].brightness, 'k-')
   ...: axes[1].plot(data["13CN"].spectral, data["13CN"].brightness, 'k-')
   ...: axes[0].set_xlabel(data["12CN"].xlabel)
   ...: axes[1].set_xlabel(data["13CN"].xlabel)
   ...: axes[0].set_ylabel(data["12CN"].ylabel)
   ...: _ = axes[1].set_ylabel(data["13CN"].ylabel)
   ...: plt.savefig(src+'_spectra.png')
   ...: 

In [3]: # =======================
   ...: # Import Molecular Data |
   ...: # =======================
   ...: 
   ...: from bayes_cn_hfs import get_molecule_data
   ...: 
   ...: mol_data_12CN = get_molecule_data(
   ...:     "CN, v = 0, 1", # molecule name in JPLSpec
   ...:     vibrational_state = 0, # vibrational state number
   ...:     rot_state_lower = 0, # lower rotational state
   ...: )
   ...: 
   ...: 

In [4]: # ==================
   ...: # Model Definition |
   ...: # ==================
   ...: 
   ...: from bayes_cn_hfs import HFSModel
   ...: 
   ...: # Initialize and define the model
   ...: n_clouds = 3
   ...: baseline_degree = 3
   ...: model = HFSModel(
   ...:     data=data_12CN,
   ...:     mol_data=mol_data_12CN, # molecular data
   ...:     bg_temp = 2.7, # assumed background temperature (K)
   ...:     n_clouds=n_clouds,
   ...:     baseline_degree=baseline_degree,
   ...:     seed=1234,
   ...:     verbose=True
   ...: )
   ...: model.add_priors(
   ...:     prior_log10_N = [14.0, 1.0], # mean and width of log10(N) prior (cm-2)
   ...:     prior_log10_tex = [1.0, 0.1], # mean and width of log10(Tex) prior (K)
   ...:     prior_fwhm = 1.0, # width of FWHM prior (km/s)
   ...:     prior_velocity = [-2.6, 10.0], # mean and width of velocity prior (km/s)
   ...:     prior_rms = 0.1, # width of spectral rms prior (K)
   ...:     prior_baseline_coeffs = [1.0, 1.0, 1.0], # width of normalized polynomial baseline coeffs
   ...:     ordered = False, # do not assume optically-thin
   ...: )
   ...: model.add_likelihood()
   ...: 
   ...: 

In [5]: # ===========================
   ...: # Check Prior Distributions |
   ...: # ===========================
   ...: 
   ...: from bayes_spec.plots import plot_predictive
   ...: 
   ...: # prior predictive check
   ...: prior = model.sample_prior_predictive(
   ...:     samples=100,  # prior predictive samples
   ...: )
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/pytensor/compile/function/types.py:959, in Function.__call__(self, *args, **kwargs)                                                                                                                                                
    957 try:
    958     outputs = (
--> 959         self.vm()
    960         if output_subset is None
    961         else self.vm(output_subset=output_subset)
    962     )
    963 except Exception:

IndexError: index out of bounds

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
Cell In[5], line 8
      5 from bayes_spec.plots import plot_predictive
      7 # prior predictive check
----> 8 prior = model.sample_prior_predictive(
      9     samples=100,  # prior predictive samples
     10 )

File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/bayes_spec/base_model.py:411, in BaseModel.sample_prior_predictive(self, samples)
    403 """Generate prior predictive samples
    404 
    405 :param samples: Number of prior predictive samples to draw, defaults to 50
   (...)
    408 :rtype: az.InferenceData
    409 """
    410 # validate
--> 411 assert self._validate()
    413 with self.model:
    414     trace = pm.sample_prior_predictive(samples=samples, random_seed=self.seed)

File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/bayes_spec/base_model.py:232, in BaseModel._validate(self)
    229     raise ValueError("No observed variables found! Did you add_likelihood()?")
    231 # check that model can be evaluated
--> 232 if not np.isfinite(self.model.logp().eval(self.model.initial_point())):
    233     raise ValueError("Model initial point is not finite! Mis-specified model or bad priors?")
    235 return True

File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/pytensor/graph/basic.py:652, in Variable.eval(self, inputs_to_values, **kwargs)
    646         warnings.warn(
    647             "Keyword arguments could not be used to create a cache key for the underlying variable. "
    648             f"A function will be recompiled on every call with such keyword arguments.\n{exc}"
    649         )
    651 args = [parsed_inputs_to_values[param] for param in inputs]
--> 652 return fn(*args)

File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/pytensor/compile/function/types.py:972, in Function.__call__(self, *args, **kwargs)                                                                                                                                                
    970     if hasattr(self.vm, "thunks"):
    971         thunk = self.vm.thunks[self.vm.position_of_error]
--> 972     raise_with_op(
    973         self.maker.fgraph,
    974         node=self.vm.nodes[self.vm.position_of_error],
    975         thunk=thunk,
    976         storage_map=getattr(self.vm, "storage_map", None),
    977     )
    978 else:
    979     # old-style linkers raise their own exceptions
    980     raise

File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/pytensor/link/utils.py:524, in raise_with_op(fgraph, node, thunk, exc_info, storage_map)                                                                                                                                           
    519     warnings.warn(
    520         f"{exc_type} error does not allow us to add an extra error message"
    521     )
    522     # Some exception need extra parameter in inputs. So forget the
    523     # extra long error message in that case.
--> 524 raise exc_value.with_traceback(exc_trace)

File ~/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/pytensor/compile/function/types.py:959, in Function.__call__(self, *args, **kwargs)                                                                                                                                                
    956 t0_fn = time.perf_counter()
    957 try:
    958     outputs = (
--> 959         self.vm()
    960         if output_subset is None
    961         else self.vm(output_subset=output_subset)
    962     )
    963 except Exception:
    964     restore_defaults()

IndexError: index out of bounds
Apply node that caused the error: Subtensor{i}(baseline_observation_norm, 3)
Toposort index: 15
Inputs types: [TensorType(float64, shape=(None,)), ScalarType(uint8)]
Inputs shapes: [(3,), ()]
Inputs strides: [(8,), ()]
Inputs values: [array([0., 0., 0.]), 3]
Outputs clients: [[ExpandDims{axis=0}(Subtensor{i}.0)]]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
    result = runner(coro)
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
    coro.send(None)
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-1880848d414d>", line 28, in <module>
    model.add_likelihood()
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/bayes_cn_hfs/hfs_model.py", line 189, in add_likelihood
    baseline_models = self.predict_baseline()
  File "/users/dbalser/miniconda3/envs/bayes_cn_hfs/lib/python3.12/site-packages/bayes_spec/base_model.py", line 394, in predict_baseline
    self.model[f"baseline_{key}_norm"][i] / (i + 1.0) ** i * dataset.spectral_norm**i

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

In [6]:
tvwenger commented 1 month ago

@danabalser The parameter prior_baseline_coeffs must have length equal to the number of baseline polynomial coefficients, which is baseline_degree + 1. In this case, you have specified baseline_degree = 3 so prior_baseline_coeffs must have length 4. Try: prior_baseline_coeffs = [1.0, 1.0, 1.0, 1.0] or just prior_baseline_coeffs = None.

I will update the model to produce a more informative error message!

danabalser commented 1 month ago

@tvwenger Okay, that worked. Thanks.

tvwenger commented 3 weeks ago

Better error message implemented by https://github.com/tvwenger/bayes_spec/pull/32