cosimoNigro / agnpy_paper

Repository containing scripts to produce the plots in the agnpy paper.

Unrealistically large systematic errors on the flux points are assumed to estimate the errors on the model parameters #3

Open cosimoNigro opened 3 years ago

cosimoNigro commented 3 years ago

The reduced statistic is high, even after removing a model parameter (I am now fixing the blob radius through the variability time scale). Sherpa actually prevents the computation of errors with the confidence method when the final reduced statistic is larger than 3, see the discussion here.

The only thing I could do to make the reduced statistic smaller (besides freezing one of the model parameters, R_b) was to include systematic errors on the normalisation of the flux points. The ones I have to include to get a decent fit statistic (<3) are largely overestimated: I am considering 15% for gamma-ray instruments and 10% at lower energies. I think a more correct figure would be 10% at gamma-ray energies and a few percent (1-3%) for the lower-energy instruments (i.e. in the X-ray or UV band).
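For concreteness, the quadrature combination of statistical and fractional systematic errors described above can be sketched like this (the function name and array values are made up for illustration):

```python
import numpy as np

def add_sys_error(flux, flux_err_stat, sys_frac):
    """Add a fractional systematic error in quadrature to the statistical one."""
    return np.sqrt(flux_err_stat ** 2 + (sys_frac * flux) ** 2)

flux = np.array([1.0e-11, 2.0e-11])      # nuFnu, erg cm-2 s-1
err_stat = np.array([1.0e-12, 1.5e-12])  # statistical errors
err_tot = add_sys_error(flux, err_stat, 0.10)  # assuming 10% systematics
```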

@jsitarek, @davidsanchez, please try the fitting scripts and let me know how we can improve this (freeze other parameters, correct the systematics, re-bin etc...).

The gammapy and sherpa fits now produce diagnostic plots: a statistic profile for sherpa and a correlation matrix for gammapy. I don't produce the statistic profile for gammapy because it's painfully slow.

jsitarek commented 3 years ago

Hi @cosimoNigro, I would not say that 15% systematics is unrealistically high. What we claim in MAGIC is 15%, but on the energy scale, which, together with some extra systematics on the flux normalization, ends up as a ~30% systematic uncertainty for a not very steep spectrum. This 30% is also what is normally used in U.L. calculations. However, treating systematic uncertainties as just one more statistical term is not very correct: while it can be fine for getting a decent fit, the fit parameter uncertainties derived from such a procedure would be pretty artificial. In the case of IACTs one could try to fit a nuisance parameter that scales the whole VHE spectrum up or down, but doing the same at lower energies can be tricky.

Next week I will have a look at the actual script and check if something reasonable can be done. The fallback solution (which you will probably not like, but it might be the most sensible thing to do given all the limitations) is to report only the values and not the uncertainties.
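The nuisance-parameter idea could be sketched as a chi-square with an extra scale factor `s` on the model, constrained by a Gaussian prior; everything here (function name, toy data, the 30% prior width) is illustrative, not code from the repository:

```python
import numpy as np

def chi2_with_scale(params, data, err_stat, model_fn, scale_prior_width=0.30):
    """Chi-square with a nuisance parameter s rescaling the model flux,
    constrained by a Gaussian prior of width scale_prior_width (e.g. 30%)."""
    *model_params, s = params
    model = s * model_fn(model_params)
    chi2_data = np.sum(((data - model) / err_stat) ** 2)
    chi2_prior = ((s - 1.0) / scale_prior_width) ** 2
    return chi2_data + chi2_prior

# toy example: a constant model, data 20% above the model normalisation
data = np.full(3, 1.2)
err_stat = np.full(3, 0.1)
model_fn = lambda p: p[0] * np.ones(3)

perfect = chi2_with_scale([1.2, 1.0], data, err_stat, model_fn)   # s = 1, exact fit
rescaled = chi2_with_scale([1.0, 1.2], data, err_stat, model_fn)  # s absorbs the offset
```

The prior term penalises scale factors far from 1, so the fit can absorb a flux-normalisation offset only at the cost the prior assigns to it.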

davidsanchez commented 3 years ago

Hi, I have never tried this part of the code, but I have a (not yet fully tested) piece of code that uses MCMC for the fit. I can have a look at the same data as you and see what I get. david

cosimoNigro commented 3 years ago

Thanks @davidsanchez, it would be interesting to check the results also with MCMC fitting. Do you want to include this MCMC fitting in the paper, or do you propose it just as a crosscheck? I think it's too complex a method to be inserted in a paper whose objective is just to present the code and show how it can be interfaced with a fitting software.

We agreed with @jsitarek just to report the best-fit values of the parameters and not to report the errors on them, as we cannot properly assess the systematic uncertainties on the individual flux points, and would therefore probably underestimate or overestimate the final error on the parameters. Nonetheless, @jsitarek, which systematic error should we finally use on the flux points when we perform the fit? Given your comment I propose 30% for gamma-ray instruments and 5% for low-energy instruments. Do you agree?

jsitarek commented 3 years ago

Hi,

For IACTs (>100 GeV) I think you can apply 30%, this value is used in many U.L. papers, and one can get it from the numbers in the MAGIC performance paper for a reasonable source spectrum.

For Fermi, however, this should be smaller. Its energy scale is quite precise (2%-5%), and they have a few systematic effects at the few-percent level, so I think using 10% makes sense.

Concerning the low energies (optical, X-ray instruments), I have no feeling for the systematics, so probably using the 5% from Vandad is best. Unfortunately this is also the most important number, because those measurements have small statistical uncertainties, so the fit will probably be pulled towards these low-state spectra.
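Collecting the figures above in one place, a hypothetical helper mapping photon energy to a systematic fraction could look like this (thresholds and values are only the rough guesses from this discussion, not anything defined in agnpy):

```python
def sys_frac(energy_eV):
    """Fractional systematic error as a function of photon energy.
    Thresholds and values are assumptions taken from this thread."""
    if energy_eV >= 100e9:  # IACT band, above ~100 GeV
        return 0.30
    if energy_eV >= 100e6:  # Fermi-LAT band, above ~100 MeV
        return 0.10
    return 0.05             # optical / X-ray instruments
```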

davidsanchez commented 3 years ago

@cosimoNigro As a crosscheck, yes. In any case, having this at hand never hurts. Do you have the data somewhere?

cosimoNigro commented 3 years ago

Hi @davidsanchez, all the data are in the agnpy package itself. I fetch them with pkg_resources:

import pkg_resources
import astropy.units as u
from astropy.table import Table

sed_path = pkg_resources.resource_filename("agnpy", "data/mwl_seds/Mrk421_2011.ecsv")
sed_table = Table.read(sed_path)
x = sed_table["nu"].to("eV", equivalencies=u.spectral())
y = sed_table["nuFnu"].to("erg cm-2 s-1")
y_err_stat = sed_table["nuFnu_err"].to("erg cm-2 s-1")

Watch out: some of the tables store energies and some frequencies. I will open an issue in agnpy to make their format uniform. Still, all the SEDs are in the main repo.

davidsanchez commented 3 years ago

Thanks a lot, I'll do that asap. best, david

cosimoNigro commented 3 years ago

As discussed with @jsitarek and with the sherpa developers, I removed the error estimation from the fitting scripts (see PR #5). The lack of a precise value for the systematic uncertainties we assume and the high reduced statistic we obtain do not allow a proper error estimation.

davidsanchez commented 3 years ago

Hi, I had a look at the code for Mrk421 and managed to define my own statistic instead of using chi2. This is technically working but might be wrong. It would also allow us to make chi2 profiles and estimate the errors from them.

import numpy as np
from sherpa.fit import Fit
from sherpa.optmethods import LevMar
from sherpa.stats import UserStat

def func(data, model, staterror, syserror=None, weight=None):
    # combine statistical and systematic errors in quadrature
    if syserror is None:
        syserror = np.zeros_like(staterror)
    error = np.sqrt(staterror ** 2 + syserror ** 2)

    # Gaussian log-probability, i.e. chi^2 / 2
    difference = model - data
    logprob = difference ** 2 / (2.0 * error ** 2)

    return np.sum(logprob), logprob

mystat = UserStat(func)
fitter = Fit(sed, agnpy_ssc, stat=mystat, method=LevMar())
jsitarek commented 3 years ago

Hi @davidsanchez, but isn't the behaviour the same as in the original code? @cosimoNigro included systematic uncertainties, and as far as I understand they were treated the same way as in your code (added in quadrature to the statistical ones). The main problem is that in reality those errors are strongly correlated between points coming from the same instrument, and also the systematic uncertainty values are only very rough guesses (and their "distributions" can be basically anything).

davidsanchez commented 3 years ago

Hi @jsitarek, I get different results, so my simple code might differ from what Sherpa does (I am not sure it is correct). With this approach we could include any correlation we can estimate. Of course it does not solve the issue with the systematic uncertainty values that you mentioned.

cosimoNigro commented 3 years ago

Hello @davidsanchez, @jsitarek

As far as I understand from David's snippet, he is minimising chi^2/2 instead of the plain chi^2, which is what sherpa minimises. I found only this post debating which type of chi^2 should be minimised. Looking at the paper referenced in the selected answer, the argument for one over the other is not really clear to me. Also, the question seems related to MCMC fitting.

I have some questions:

1. I can see the eventual numerical advantage of minimising log(chi^2), but what is the numerical advantage of chi^2/2 w.r.t. chi^2?
2. What do you mean by different results? A different final statistic (which we should expect) or different best-fit parameters? The latter should not happen...
3. Why do they suggest in the post to minimise -chi^2/2? Why the minus sign?

The only alternative I can think of is a weighted chi^2, somehow balancing the numerous and precise measurements on the synchrotron bump against the sparse and uncertain measurements on the IC bump. But I don't want to go into the details of the statistics implementation in the paper.
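Such a weighted chi^2 is easy to sketch (the function name, data, and weights here are purely illustrative; this is not what sherpa implements):

```python
import numpy as np

def weighted_chi2(data, model, err, weights):
    """Chi-square with per-point weights, e.g. to down-weight the many
    precise synchrotron points against the sparse IC points."""
    return np.sum(weights * ((data - model) / err) ** 2)

# toy numbers: down-weight the first point by a factor 2
data = np.array([1.0, 2.0])
model = np.array([1.1, 2.2])
err = np.array([0.1, 0.1])
weights = np.array([0.5, 1.0])

val = weighted_chi2(data, model, err, weights)
```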

Let me know what you think...

jsitarek commented 3 years ago

Hi @cosimoNigro, @davidsanchez,

It should obviously not matter for the actual minimum whether you minimise chi2 or chi2/2. Only for the uncertainties one has to be careful, because when you compute them from the chi2 profile a specific difference in chi2 is taken as the measure of the error, so there a multiplicative factor would matter.

The confusion between chi2, chi2/2 and the log-likelihood probably comes from the difference between chi2 and likelihood. If you assume that the distribution of the values is Gaussian (as we do) and compute its log-likelihood, you end up exactly with the chi2 statistic, up to a factor of -1/2 and an additive constant (which also answers the minus-sign question: maximising -chi2/2 is minimising chi2). So under this assumption minimisation of chi2 and maximisation of the log-likelihood are perfectly equivalent. You would get a different result only if you assumed a different distribution (e.g. Poissonian).
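A quick numerical check of the two statements above, with toy Gaussian data (all values made up for illustration):

```python
import numpy as np

# toy Gaussian data with a single mean parameter mu
data = np.array([0.9, 1.1, 1.0])
err = np.full(3, 0.1)

# scan chi2 over a grid of mu values
mus = np.linspace(0.8, 1.2, 401)
chi2 = np.array([np.sum(((data - m) / err) ** 2) for m in mus])

# the minimum is at the same mu whether one scans chi2 or chi2/2 ...
i_min = int(np.argmin(chi2))
best_mu = mus[i_min]

# ... but the 1-sigma interval comes from Delta(chi2) = 1,
# which corresponds to Delta(chi2/2) = 0.5, so the multiplicative
# factor matters for the error estimation, not for the best fit
```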

davidsanchez commented 3 years ago

I'll try with chi^2/2 for the errors computation.

jsitarek commented 3 years ago

Hi @davidsanchez

But you should be certain that the code actually expects chi2/2, otherwise the uncertainties would not be computed properly. Either way, despite Matteo's comment, I think the idea was still not to include the uncertainties.

davidsanchez commented 3 years ago

Sure, I agree not to include the uncertainties.