Closed tomasdsousa closed 11 months ago
For a minimum working example, here I create three data points from the line $y = 1$, with $x$ in the range [0,1] and I fit this to the line $y = a_0 + a_1 x + a_2 x^2$. The negative log-likelihood is defined to be infinite if $a_1 \leq 0$, so when the integer snap happens, the DL becomes infinite. This is the behaviour which needs to be removed. The code needs to identify that
import numpy as np
from esr.fitting.likelihood import Likelihood
from esr.fitting.fit_single import single_function
class MyLikelihood(Likelihood):
def __init__(self, nx):
super().__init__('data_file', 'cov_file', 'inf_test',)
self.ylabel = r'$y$'
self.xvar = np.random.uniform(0, 1, size=nx)
self.yerr = np.full(nx, 0.2)
self.yvar = np.ones(nx) + self.yerr * np.random.normal(0, 1, size=nx)
def negloglike(self, a, eq_numpy, **kwargs):
if a[1] <= 0:
return np.inf
ypred = self.get_pred(self.xvar, np.atleast_1d(a), eq_numpy)
if not np.all(np.isreal(ypred)):
return np.inf
nll = np.sum(0.5 * (ypred - self.yvar) ** 2 / self.yerr ** 2 + 0.5 * np.log(2 * np.pi) + np.log(self.yerr))
if np.isnan(nll):
return np.inf
return nll
np.random.seed(1)
likelihood = MyLikelihood(3)
labels = ["+", "a0", "+", "*", "a1", "x", "*", "a2", "pow", "x", "2"]
basis_functions = [["x", "a"], # type0
["inv"], # type1
["+", "*", "-", "/", "pow"]] # type2
logl, dl = single_function(labels,
basis_functions,
likelihood,
verbose=True,)
We now check the effect of setting parameters to zero. We first try setting all parameters with Nsteps<1
to zero and evaluate the log-likelihood. If this is finite, we continue as before. Now, if this doesn't work, we try to set fewer parameters to zero until we find a combination which gives a finite log-likelihood. Once this is found, we use these parameters. The important changes to the code are
The output for the above example is
Max number of parameters: 3
Sympy simplify 3
theta_ML: [0.77880209 0.09519103 0. ]
Residuals: -2.0165679395645437 -2.0714981376882804
Parameter: -0.11582338820956806
Function: 18.396964217335046
Description length: 16.264572889560934
so it passes the two tests above.
@tomasdsousa please can you verify that this modification gives the expected result for your problem?
Yes, thank you! Just had to move line 149 outside of the if block, because of line 195.
Good spot! This is fixed in 6af33044cd8916a5a4f726a1209e620499e2e6a3
(In test_all_Fisher.py.) Setting a parameter to zero might return a nll of inf/Nan, in which case it would be better to keep the original value. It might also be beneficial to loop over theta_ML[Nsteps<1] to see which parameter(s) is causing this behaviour when theta_ML[Nsteps<1]>1.