Open JanKoune opened 2 years ago
A minimal example based on the linear regression integration test:
```python
import numpy as np

from probeye.definition.inverse_problem import InverseProblem
from probeye.definition.forward_model import ForwardModelBase
from probeye.definition.sensor import Sensor
from probeye.definition.likelihood_model import GaussianLikelihoodModel
from probeye.inference.emcee.solver import EmceeSolver

# ============================================================================ #
#                              Set numeric values                              #
# ============================================================================ #

# options
n_steps = 200
n_initial_steps = 100
n_walkers = 20

# 'true' value of a, and its normal prior parameters
a_true = 2.5
mean_a = 2.0
std_a = 1.0

# 'true' value of b, and its normal prior parameters
b_true = 1.7
mean_b = 1.0
std_b = 1.0

# 'true' value of additive error sd, and its uniform prior parameters
sigma = 0.5
low_sigma = 0.0
high_sigma = 0.8

# the number of generated experiments and seed for random numbers
n_tests = 50
seed = 1

# ============================================================================ #
#                           Define the Forward Model                           #
# ============================================================================ #

class LinearModel(ForwardModelBase):
    def interface(self):
        self.parameters = [{"a": "m"}, "b"]
        self.input_sensors = Sensor("x")
        self.output_sensors = Sensor("y", std_model="sigma")

    def response(self, inp: dict) -> dict:
        # this method *must* be provided by the user
        x = inp["x"]
        m = inp["m"]
        b = inp["b"]
        return {"y": m * x + b}

    def jacobian(self, inp: dict) -> dict:
        x = inp["x"]  # vector
        one = np.ones((len(x), 1))
        return {"y": {"x": None, "m": x.reshape(-1, 1), "b": one}}

# ============================================================================ #
#                         Define the Inference Problem                         #
# ============================================================================ #

problem = InverseProblem("Linear regression (AME)")

problem.add_parameter(
    "a",
    "model",
    tex="$a$",
    info="Slope of the graph",
    prior=("normal", {"mean": mean_a, "std": std_a}),
)
problem.add_parameter(
    "b",
    "model",
    info="Intersection of graph with y-axis",
    tex="$b$",
    prior=("normal", {"mean": mean_b, "std": std_b}),
)
problem.add_parameter(
    "sigma",
    "likelihood",
    domain="(0, +oo)",
    tex=r"$\sigma$",
    info="Standard deviation of the zero-mean additive model error",
    prior=("uniform", {"low": low_sigma, "high": high_sigma}),
)

# add the forward model to the problem
linear_model = LinearModel("LinearModel")
problem.add_forward_model(linear_model)

# ============================================================================ #
#                    Add test data to the Inference Problem                    #
# ============================================================================ #

# data generation; normal likelihood with constant variance around each point
np.random.seed(seed)
x_test = np.linspace(0.0, 1.0, n_tests)
y_true = linear_model.response(
    {linear_model.input_sensor.name: x_test, "m": a_true, "b": b_true}
)[linear_model.output_sensor.name]
y_test = np.random.normal(loc=y_true, scale=sigma)

# add the experimental data
problem.add_experiment(
    "TestSeries_1",
    fwd_model_name="LinearModel",
    sensor_values={
        linear_model.input_sensor.name: x_test,
        linear_model.output_sensor.name: y_test,
    },
)

# ============================================================================ #
#                           Add likelihood model(s)                            #
# ============================================================================ #

problem.add_likelihood_model(
    GaussianLikelihoodModel(
        prms_def="sigma",
        experiment_name="TestSeries_1",
        model_error="additive",
        name="SimpleLikelihoodModel",
    )
)

# ============================================================================ #
#                    Solve problem with inference engine(s)                    #
# ============================================================================ #

emcee_solver = EmceeSolver(
    problem,
    show_progress=True,
)
inference_data = emcee_solver.run_mcmc(
    n_walkers=n_walkers,
    n_steps=n_steps,
    n_initial_steps=n_initial_steps,
    vectorize=True,
)
```
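Independently of probeye internals, the kind of failure a vectorized call can trigger is easy to sketch with plain NumPy (the helper names below are hypothetical, for illustration only; whether this is exactly the error raised in `check_parameter_domains` is not confirmed here). A domain check written for a scalar parameter value raises NumPy's ambiguous-truth-value error when handed one value per walker, whereas a vectorized check returns one boolean per walker:

```python
import numpy as np

def in_domain_scalar(value, low=0.0, high=0.8):
    # scalar-style check, as a per-parameter domain test might be written
    return low < value < high

def in_domain_vectorized(values, low=0.0, high=0.8):
    # vectorized variant: one boolean per walker
    values = np.asarray(values)
    return (low < values) & (values < high)

theta = np.array([0.5, 0.9, 0.1])  # one sigma value per walker

try:
    in_domain_scalar(theta)
except ValueError as err:
    # the chained comparison calls bool() on an array, which is ambiguous
    print(type(err).__name__)  # ValueError

print(in_domain_vectorized(theta))  # [ True False  True]
```

Supporting `vectorize=True` would therefore require every such scalar-style check (and similar branches elsewhere) to handle an extra walker dimension.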
I looked into this, and I saw that quite a few changes would be necessary to make this feature possible. I also think the code would get more complicated, because many more ifs and for-loops would be needed in a lot of methods and functions. So it is possible, but at the cost of additional code complexity. If there is no urgent need, I would not implement this at the moment.
Emcee allows for vectorized calls to the log-likelihood function by setting `vectorize=True` in the `EnsembleSampler`. With this option enabled, the input to the log-likelihood function is a 2D array of theta values of size `(n_walkers, n_theta)`. Doing this in probeye results in an error in `check_parameter_domains` (see the minimal example above). Having this functionality could be useful for:
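Independent of the use cases above, the shape contract that `vectorize=True` imposes can be sketched with plain NumPy (no probeye involved; the log-density below is an illustrative stand-in, not the model from the example): emcee then calls the log-probability once per move with the whole ensemble, so the function must map a `(n_walkers, n_dim)` array to `(n_walkers,)` log-probabilities:

```python
import numpy as np

def log_prob(theta):
    # with emcee's vectorize=True, theta arrives as (n_walkers, n_dim)
    # and one log-probability per walker must be returned
    theta = np.atleast_2d(theta)
    # illustrative standard-normal log-density (up to a constant)
    return -0.5 * np.sum(theta**2, axis=-1)

n_walkers, n_dim = 20, 3
ensemble = np.zeros((n_walkers, n_dim))
print(log_prob(ensemble).shape)  # (20,)
```

This is the shape that probeye's internals (e.g. `check_parameter_domains`) would have to accept and reduce per walker for the option to work end to end.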