Linear analytic solution

giacomomagni commented 9 months ago

Addresses feature #60

- [x] Add the analytic module
- [x] Add some docs

codecov[bot] commented 9 months ago

Codecov Report

Merging #62 (5bd14bf) into main (7fdcee0) will increase coverage by 0.16%. The diff coverage is 37.75%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #62      +/-   ##
==========================================
+ Coverage   42.85%   43.01%   +0.16%
==========================================
  Files          28       29       +1
  Lines        2245     2318      +73
==========================================
+ Hits          962      997      +35
- Misses       1283     1321      +38
```

| [Flag](https://app.codecov.io/gh/LHCfitNikhef/smefit_release/pull/62/flags) | Coverage Δ | |
|---|---|---|
| unittests | `43.01% <37.75%> (+0.16%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Files](https://app.codecov.io/gh/LHCfitNikhef/smefit_release/pull/62) | Coverage Δ | |
|---|---|---|
| `src/smefit/optimize/__init__.py` | `61.66% <ø> (ø)` | |
| `src/smefit/optimize/mc.py` | `55.20% <ø> (-0.47%)` | :arrow_down: |
| `src/smefit/optimize/ultranest.py` | `62.00% <ø> (-0.38%)` | :arrow_down: |
| `src/smefit/prefit/__init__.py` | `0.00% <ø> (ø)` | |
| `src/smefit/cli/__init__.py` | `0.00% <0.00%> (ø)` | |
| `src/smefit/optimize/analytic.py` | `60.71% <60.71%> (ø)` | |
| `src/smefit/runner.py` | `46.55% <10.34%> (-0.24%)` | :arrow_down: |

giacomomagni commented 9 months ago

One of the rare times that something works out of the box! :relieved: :relieved: :relieved:

compare_analytic_ultranest_linear.zip

giacomomagni commented 9 months ago

I still want to add some docs, but you might check whether the code is doing what you had in mind. To run:

smefit A your_path_to_runcard

There is a single additional parameter required in the runcard: n_samples, which has to be set to the number of samples that you want. Maybe we should replace it with something like the relative accuracy that you want?

I also added an explicit check on whether the inverse of the Fisher matrix is positive semidefinite, as I think the method will not work in the case of flat directions.
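For readers following the thread, the closed-form solution and the flat-direction check can be sketched roughly as below. This is a minimal illustration, not the code in `src/smefit/optimize/analytic.py`: the function name, the argument names (`K`, `inv_cov`, `delta`) and the eigenvalue tolerance are all assumptions, and the check here is done on the eigenvalues of the Fisher matrix itself rather than on its inverse.

```python
import numpy as np


def linear_analytic_solution(K, inv_cov, delta, rtol=1e-12):
    """Closed-form Gaussian posterior for a theory that is linear in the coefficients.

    All names are hypothetical:
    K       : (n_dat, n_coeff) matrix of linear corrections
    inv_cov : (n_dat, n_dat) inverse covariance matrix of the data
    delta   : (n_dat,) data minus the prediction at zero coefficients
    """
    fisher = K.T @ inv_cov @ K
    # Symmetrize to remove rounding noise before the eigenvalue check.
    fisher = 0.5 * (fisher + fisher.T)

    # A flat direction appears as a (numerically) vanishing eigenvalue,
    # in which case the Fisher matrix cannot be inverted reliably.
    eigvals = np.linalg.eigvalsh(fisher)
    if eigvals.min() <= rtol * eigvals.max():
        raise ValueError("Fisher matrix is singular: flat directions present")

    coeff_cov = np.linalg.inv(fisher)
    best_fit = coeff_cov @ K.T @ inv_cov @ delta
    return best_fit, coeff_cov
```

With exactly degenerate operators the smallest eigenvalue is zero up to rounding, so the function raises instead of returning a meaningless covariance.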

LucaMantani commented 9 months ago

Yes indeed, if there are flat directions the cov mat would be singular and the method would not work well! Looks like it's working indeed :)

jacoterh commented 9 months ago

Amazing @giacomomagni, looks really nice! Do we know the reason why the posteriors look different for some of the 4 heavy operators? Is this related to the Hessian being singular?

giacomomagni commented 9 months ago

> Amazing @giacomomagni, looks really nice! Do we know the reason why the posteriors look different for some of the 4 heavy operators? Is this related to the Hessian being singular?

I had to remove the 4 heavy operators because of the flat directions. I left only Ott, which is indeed highly correlated with the others...

So do you prefer to have n_samples or the relative accuracy?

jacoterh commented 9 months ago

I am not sure why we even need samples in the first place if we already know the analytic form of the multivariate gaussian. Can't we obtain the bounds directly without sampling?

LucaMantani commented 9 months ago

I think it's still good to sample, so that one can use all the plotting and analysis routines. It's not a very costly operation, I guess it runs really fast @giacomomagni?

How would the relative accuracy work? I think it's more intuitive to specify the number of samples.

giacomomagni commented 9 months ago

> I am not sure why we even need samples in the first place if we already know the analytic form of the multivariate gaussian. Can't we obtain the bounds directly without sampling?

Bounds are logged in a table at the end of the run. I believe samples are just needed to make comparisons with the other tools/methods easier. Also, without samples you would need to save the full covmat, otherwise correlations are lost.
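To make the two options concrete, here is a small sketch of both: drawing `n_samples` points from the known Gaussian, and reading the bounds directly from the covariance matrix. The helper names and the 95% factor `z=1.96` are assumptions made for illustration and are not taken from the smefit code.

```python
import numpy as np


def sample_posterior(best_fit, coeff_cov, n_samples, seed=None):
    """Draw n_samples points from the analytic multivariate Gaussian posterior."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(best_fit, coeff_cov, size=n_samples)


def closed_form_bounds(best_fit, coeff_cov, z=1.96):
    """Marginalised symmetric bounds (95% for z=1.96), with no sampling involved."""
    sigma = np.sqrt(np.diag(coeff_cov))
    return np.column_stack([best_fit - z * sigma, best_fit + z * sigma])


best_fit = np.array([0.5, -1.0])
coeff_cov = np.array([[1.0, 0.8], [0.8, 2.0]])

samples = sample_posterior(best_fit, coeff_cov, n_samples=200_000, seed=0)
bounds = closed_form_bounds(best_fit, coeff_cov)
# The sample covariance carries the same correlations as coeff_cov,
# up to statistical fluctuations.
sample_cov = np.cov(samples, rowvar=False)
```

The samples encode the correlations implicitly, which is why existing plotting and analysis routines can consume them unchanged; the closed-form bounds alone only use the diagonal of the covariance.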

giacomomagni commented 9 months ago

> I think it's still good to sample, so that one can use all the plotting and analysis routines. It's not a very costly operation, I guess it runs really fast @giacomomagni?

Yes, it's immediate...

> How would the relative accuracy work? I think it's more intuitive to specify the number of samples.

Something like: you draw samples until the std/mean of the samples reach the required accuracy with respect to the known values?!

LucaMantani commented 9 months ago

> > I think it's still good to sample, so that one can use all the plotting and analysis routines. It's not a very costly operation, I guess it runs really fast @giacomomagni?
>
> Yes, it's immediate...
>
> > How would the relative accuracy work? I think it's more intuitive to specify the number of samples.
>
> Something like: you draw samples until the std/mean of the samples reach the required accuracy with respect to the known values?!

Ah ok, not sure. I think it's fine with number of samples too.

jacoterh commented 9 months ago

We could continue sampling until the relative error on the mean drops below 1%, so: std/(\sqrt{N} * central value) < 0.01. But let me stress that everything in the report can be computed analytically: the marginalised posteriors can be computed in closed form, the correlation matrix we have from the covariance, etc.
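The stopping rule proposed above could look roughly like the sketch below. It is only an illustration of the criterion std/(\sqrt{N} * central value) < 0.01: the function name, the batch size and the `max_samples` cap are assumptions, and the cap is needed because a coefficient whose central value is zero can never satisfy a relative criterion.

```python
import numpy as np


def sample_until_accurate(best_fit, coeff_cov, rel_acc=0.01, batch=1000,
                          max_samples=1_000_000, seed=None):
    """Draw samples in batches until std / (sqrt(N) * |central value|) < rel_acc
    holds for every coefficient, or until max_samples is reached."""
    rng = np.random.default_rng(seed)
    best_fit = np.asarray(best_fit, dtype=float)
    samples = rng.multivariate_normal(best_fit, coeff_cov, size=batch)
    while len(samples) < max_samples:
        # Relative error on the mean; a zero central value gives inf on purpose.
        with np.errstate(divide="ignore"):
            rel_err = samples.std(axis=0, ddof=1) / (
                np.sqrt(len(samples)) * np.abs(best_fit)
            )
        if np.all(rel_err < rel_acc):
            break
        new = rng.multivariate_normal(best_fit, coeff_cov, size=batch)
        samples = np.vstack([samples, new])
    return samples
```

For a coefficient with unit standard deviation and unit central value, the criterion requires about 10^4 samples, so the cost stays small whenever the central values are not close to zero.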

giacomomagni commented 9 months ago

Okay, let's leave the number of samples for the time being. If any of you can review this, then we should be able to merge.

LucaMantani commented 9 months ago

While reading I saw some typos in the running.md, but the rest looks fine to me :)

LHCfitNikhef / smefit_release

Linear analytic solution #62