fermi-lat / Likelihood

Priors introduce offset in logLikelihood #89

Closed. jballet closed this issue 3 years ago

jballet commented 4 years ago

I have tried introducing a (very broad) prior on the spectral index of sources, in order to eliminate the need for hard limits on that parameter. This worked fine ... except that the TS values were all wrong. I traced that to the fact that setting a prior on a parameter adds a positive offset to the logLikelihood, equal to the sigma of the prior. Because of that, the source TS increases by 2 sigma (in my case, by 4). This is of course very annoying, and I don't think it can be intended, so it must be a bug, and it should be easy to fix. I attach an illustration in which I centered the prior on the best fit of the index, so adding the priors should have changed exactly nothing. test_priorsTS.zip

eacharles commented 3 years ago

Matt Wood had a similar discussion with some of the CTA folks a while back. The issue of course is that formally adding the prior does change the log-likelihood by exactly that value. I think that both Matt and I felt that the implementation that you request would have been better, but for consistency with how they do things we left the implementation as it was.

Unfortunately, there is already some code out there that relies on the current implementation of the prior in the Likelihood package.

Let me suggest that the implementation of the Ts() function in pyLikelihood.AnalysisBase is formally incorrect. https://github.com/fermi-lat/pyLikelihood/blob/master/python/AnalysisBase.py

What it does is to delete the test source, including the prior on the index, in order to get the null hypothesis. I don't think that is what Wilks' theorem says to do; rather, I think it should fix the prefactor to zero, fit the model, and use that as the null hypothesis. The prior will then force the index to its central value.
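For concreteness, a minimal sketch of that scheme is below. It assumes the usual AnalysisBase-style methods (normPar, fit, logLike.value, setValue/setFree); the helper name is hypothetical, and the normalization's lower bound may need to be relaxed before it can be pushed to zero, so treat this as pseudocode rather than a drop-in replacement for Ts().

```python
def ts_keeping_prior(like, srcName):
    """Sketch of a TS that keeps the prior in place for the null hypothesis.

    Instead of deleting the test source (which also throws away the prior on
    its index), fix the source's normalization at zero and refit.  The prior
    then pulls the index to its central value in the null fit.
    """
    like.fit(verbosity=0)                      # alternative hypothesis: source included
    logL1 = like.logLike.value()

    norm = like.normPar(srcName)               # normalization parameter of the test source
    saved_value, saved_free = norm.getValue(), norm.isFree()
    norm.setValue(0.0)                         # may require widening the lower bound first
    norm.setFree(False)

    like.fit(verbosity=0)                      # null hypothesis: source switched off
    logL0 = like.logLike.value()

    norm.setValue(saved_value)                 # restore the model
    norm.setFree(saved_free)
    return 2.0 * (logL1 - logL0)
```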

eacharles commented 3 years ago

For example, Matt found that he had to re-implement Ts() to do the right thing in fermipy. See the Ts2 function here: https://github.com/fermiPy/fermipy/blob/master/fermipy/gtutils.py

eacharles commented 3 years ago

And yes, it would have been better if Matt had just fixed the AnalysisBase.Ts() function in pyLikelihood back in the day, but so it goes. Let me suggest that we should focus on fixing AnalysisBase.Ts() to do the right thing.

jballet commented 3 years ago

I agree that what you are suggesting will provide the right TS, but I am not sure how it fits together with the Ts call when there are no priors. It looks dangerous to me to have two different implementations, and leaving the parameters other than normPar free when there are no priors will certainly never converge. If the source is not deleted, all spectral parameters should be fixed when there are no priors (their values will not affect the result). To be clear, your solution would apply whether or not the reoptimize option is set, right?

There are other complications that worry me. This will also impact the TSCurv calculations, which are done by comparing (for instance) PowerLaw and LogParabola directly at the logLikelihood level. If I set broad priors on Index for PowerLaw and on alpha and beta for LogParabola, I fear that will again inflate my TSCurv values. You will probably answer that I should compute TSCurv by comparing LogParabola with free beta to LogParabola with fixed beta, and I agree, but that is not the way it was implemented historically.
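In formula form (notation added here), the historical definition compares two different models directly,

$$\mathrm{TS}_{\mathrm{curv}} = 2\left[\ln L_{\mathrm{LogParabola}} - \ln L_{\mathrm{PowerLaw}}\right],$$

so if each unit-normalized prior adds its own constant to its model's logLikelihood (one prior on Index for PowerLaw, two on alpha and beta for LogParabola), the difference is shifted by twice the mismatch between those constants. The within-model comparison,

$$\mathrm{TS}_{\mathrm{curv}} = 2\left[\ln L(\beta\ \mathrm{free}) - \ln L(\beta = 0)\right],$$

uses the same priors on both sides, so any constant offset they carry cancels.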

In general, I don't like the idea that logLikelihoods cannot be compared any more. I also admit that I do not understand why adding priors should increase logLike by sigma. What is the theory behind that?

eacharles commented 3 years ago

Ok, we should probably open up the discussion a bit and agree on what behavior we want. I’ll track down all the places that this affects things in fermipy. Similarly we should identify where this affects things in pyLikelihood.

As for why adding priors should increase logLike at all: I think it is the convention for the definition of the prior. It is unit normalized (i.e., the parameter has to take some value), so for a Gaussian the maximum value depends on the width of the distribution, and you end up offset by sigma. The ln(sqrt(2 pi)) offset seems to be missing, though.

Note that the C++ code lets you use any function for the prior, and Jim has implemented a GaussianError class (Likelihood/src/GaussianError.cxx) that lets you play with the normalization and offset of the Gaussian. Depending on how you set the priors in the Python code, you can force it to have a normalization such that the peak value is 1 (i.e., the log of the prior is 0 and it doesn't affect the TS).

Look at pyLikelihood/python/SrcModel.py to see how the prior is being constructed.

-e

jballet commented 3 years ago

If the offset came from the normalization, one would expect it to be log(sigma) instead of sigma itself, so the reason must be elsewhere. We may broach the topic at the next C&A meeting (Jul 20) if you wish. The GaussianError class may suit my needs, but how do I use it? Sorry if I look dumb, but just looking at SrcModel.py did not help me.
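To spell that out (standard Gaussian algebra, notation added here): a unit-normalized Gaussian prior

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

has, at its peak $x = \mu$,

$$\ln p(\mu) = -\ln\sigma - \ln\sqrt{2\pi},$$

so a purely normalization-driven effect would shift logLike by something of order $\ln\sigma$ (plus the constant $\ln\sqrt{2\pi}$), not by $\sigma$ itself.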

eacharles commented 3 years ago

It looks like your example is using the function:

AnalysisBase.addGaussianPrior from pyLikelihood/python/AnalysisBase

That calls

Parameter.addGaussianPrior from pyLikelihood/python/SrcModel.py.

That creates a C++ object of type GaussianError (Likelihood/src/GaussianError.cxx) and attaches it to the C++ Parameter object (optimizers/src/Parameter.cxx).
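The entry point is presumably called along these lines; the argument list shown here is an assumption and should be checked against AnalysisBase.py before use:

```python
# Hypothetical call pattern -- the actual signature of addGaussianPrior is
# defined in pyLikelihood/python/AnalysisBase.py and may differ.
like.addGaussianPrior('MySource', 'Index', 2.0, 2.0)  # prior centered on the fitted index, sigma = 2
```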

However, looking at that function, it is actually setting the log of the prior, not the prior itself.

```cpp
void Parameter::setPrior(Function & log_prior) {
  m_log_prior = &log_prior;
}
```

Now, the GaussianError C++ class has four parameters; their default values are defined in the constructor like this:

```cpp
GaussianError(double norm=1, double mean=0, double sigma=1, double offset=0);
```

And the value is computed like this:

```cpp
double GaussianError::value(const optimizers::Arg & xarg) const {
  double x = dynamic_cast<const optimizers::dArg &>(xarg).getValue();
  double norm = m_parameter[Norm].getTrueValue();
  double mean = m_parameter[Mean].getTrueValue();
  double sigma = m_parameter[Sigma].getTrueValue();
  double offset = m_parameter[Sigma].getTrueValue();

  double retVal = -(x - mean)*(x - mean)/(2.*sigma*sigma);
  retVal += offset;
  retVal *= norm;
  return retVal;
}
```

So, it looks like there is a bug here: the offset is being read from the Sigma parameter instead of the Offset parameter. I think that fixing that will get rid of the step-like behavior you are seeing. I'll do that now.

-e

eacharles commented 3 years ago

So, to follow up on the previous post: I need to fix the code above so that it sets the offset correctly, and I should add something to the Python addGaussianPrior() function that lets the user set the offset and norm. However, the behavior you want corresponds to the default values (norm = 1 and offset = 0), so fixing this should give you the behavior you want.
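As a sanity check of what the fixed code should return, here is a quick Python transcription of the GaussianError formula above (illustration only, not the shipped code), with the offset read from its own argument:

```python
def gaussian_log_prior(x, norm=1.0, mean=0.0, sigma=1.0, offset=0.0):
    """Log of the Gaussian prior as GaussianError::value is meant to compute it,
    once the offset comes from the Offset parameter rather than Sigma."""
    return norm * (-(x - mean) ** 2 / (2.0 * sigma ** 2) + offset)

# With the defaults norm=1 and offset=0, a prior centered on the best-fit value
# contributes nothing to logLike there, so the TS is unchanged:
print(gaussian_log_prior(2.0, mean=2.0, sigma=2.0))   # -> 0.0
```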

jballet commented 3 years ago

Indeed. So this was a simple bug after all. Very good. No need to launch a philosophical discussion then.

eacharles commented 3 years ago

Well, so long as you use the GaussianError class to define the prior, yes.

jballet commented 3 years ago

I don't see any change in FT 1.3.4. Are you sure it has been integrated?

jballet commented 3 years ago

I have checked FT 1.3.5 and the logLikelihood offset has gone away.