LHCfitNikhef / smefit_release

SMEFiT: a Standard Model Effective Field Theory fitter
GNU General Public License v3.0

Projections #57

Closed · tgiani closed this 6 months ago

tgiani commented 1 year ago

This PR should allow the user to create projections starting from an existing dataset. The central values should be given either by the SM prediction or by the SM plus a set of EFT corrections specified by the user.
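As a rough illustration of that construction (not the PR's actual code), the projected central values amount to the SM prediction plus the user-specified EFT shifts. The helper below is a minimal sketch keeping only linear corrections; all names in it are hypothetical:

```python
import numpy as np

def projected_central_values(sm_predictions, linear_corrections, wilson_coefficients):
    """Sketch: central values as SM plus user-specified (linear) EFT corrections.

    Hypothetical helper, not the PR implementation. `sm_predictions` is the array
    of SM predictions per bin, `linear_corrections` maps a Wilson-coefficient name
    to its linear EFT correction per bin, and `wilson_coefficients` holds the
    values chosen by the user. With no coefficients the result is the pure SM.
    """
    central_values = np.array(sm_predictions, dtype=float)
    for name, value in wilson_coefficients.items():
        central_values += value * np.asarray(linear_corrections[name])
    return central_values
```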

codecov[bot] commented 1 year ago

Codecov Report

Attention: Patch coverage is 0%, with 72 lines in your changes missing coverage. Please review.

Project coverage is 41.95%. Comparing base (2f13519) to head (46ff994). Report is 11 commits behind head on main.

:exclamation: Current head 46ff994 differs from pull request most recent head 9434bc3. Consider uploading reports for the commit 9434bc3 to get more accurate results

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #57      +/-   ##
==========================================
- Coverage   42.01%   41.95%   -0.06%
==========================================
  Files          29       28       -1
  Lines        2378     2255     -123
==========================================
- Hits          999      946      -53
+ Misses       1379     1309      -70
```

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `41.95% <0.00%> (-0.06%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. See the Codecov carryforward-flags documentation to find out more.

| Files | Coverage Δ | |
|---|---|---|
| `src/smefit/cli/__init__.py` | `0.00% <0.00%> (ø)` | |
| `src/smefit/projections/__init__.py` | `0.00% <0.00%> (ø)` | |

... and 11 files with indirect coverage changes
tgiani commented 1 year ago

@jacoterh @giacomomagni There is still some cleaning to be done and I should add some tests and update the docs, but if you have time please start having a look. The idea is that there is a new runcard, runcards/projection.yaml, where you specify the information needed to build the projection, and you then run smefit PROJ runcards/projection.yaml -r 0.1, where 0.1 is the factor used to reduce the statistical error. In the runcard you specify the datasets for which you want to build projections and the values of the Wilson coefficients to use for the central values (in case you want central values which are not SM-like); a minimal example is sketched below. The code creates a new folder, projection, where the new datasets are saved.
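For orientation, a projection runcard could look roughly like the sketch below. The field names (`projections_path`, `datasets`, `coefficients`) and the dataset names are illustrative placeholders, not the exact schema used in this PR:

```yaml
# Hypothetical sketch of runcards/projection.yaml -- field names and dataset
# names are assumptions, not the schema actually used in this PR.
projections_path: ./projection   # folder where the projected data are written
datasets:                        # existing datasets to build projections for
  - SOME_DATASET_1
  - SOME_DATASET_2
coefficients:                    # Wilson-coefficient values for non-SM central values
  OtG: 0.5
  Otp: -1.0
```

which would then be run as `smefit PROJ runcards/projection.yaml -r 0.1`.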

jacoterh commented 1 year ago

Hi @tgiani, thanks a lot for this. The code runs for me, so that's great! Two questions:

  1. What happens in case the statistical uncertainties are already zero by construction? This happens whenever the total experimental correlation matrix includes both systematic and statistical uncertainties. I guess there is not much we can do in that case?
  2. I did a quick test by setting all WCs to zero in the runcard to see whether the central values remained the same, and I observed some small differences. I tried with the dilepton dataset, for example. Do you reproduce this, or am I missing something obvious?
juanrojochacon commented 1 year ago

Hi @jacoterh concerning your questions:

tgiani commented 1 year ago

Hi @jacoterh

  1. Just as @juanrojochacon said: if you use the code with one of these datasets, you'll just get back something whose statistical uncertainty is again equal to 0 (since it is included in the systematic part, as you said).
  2. Again as @juanrojochacon said: the central values are by default generated using the SM predictions in the theory tables, so if you set all the WCs to 0 (or don't specify them in the runcard) you get the SM. Which dataset did you use for the test?

jacoterh commented 1 year ago

Thanks @juanrojochacon and @tgiani, that clarifies my questions! I do retrieve the SM predictions with all WCs set to zero and in the absence of statistical uncertainties. I just did not realise soon enough that the pseudo-data is generated starting from the theory predictions. All good!

jacoterh commented 12 months ago

Some suggestions that came to mind while working with a scaled-up version:

  1. The rescaling factor, currently specified by e.g. -r 0.1, should be updated to handle non-uniform rescalings. At HL-LHC the luminosity is 3 ab^-1 for all datasets, while the datasets are not all taken at the same luminosity before the projection, so we need different rescaling factors depending on the original luminosity.
  2. In view of point 1, we need to store the original luminosity in the runcard. We can then use this information to project with a syntax like smefit PROJ -lumi <x>; the rescaling factor from point 1 becomes dataset specific and is computed on the fly (see the sketch after this list).
  3. We would like a dataset filter based on features like luminosity, centre-of-mass energy, etc., that lets us run fit variants with different types of datasets.
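A minimal sketch of the dataset-specific rescaling from point 2, assuming purely statistical uncertainties scale as 1/sqrt(luminosity); the function name and the way luminosities are passed are assumptions, not this PR's interface:

```python
import numpy as np

HL_LHC_LUMI = 3000.0  # target HL-LHC luminosity in fb^-1 (i.e. 3 ab^-1)

def stat_rescaling_factor(original_lumi, target_lumi=HL_LHC_LUMI):
    """Factor multiplying the statistical uncertainties of one dataset.

    Assumes stat. errors scale as 1/sqrt(L), so projecting a dataset taken at
    `original_lumi` (fb^-1) to `target_lumi` multiplies its statistical
    uncertainties by sqrt(original_lumi / target_lumi).
    """
    return np.sqrt(original_lumi / target_lumi)

# Example: a 139 fb^-1 Run 2 dataset projected to HL-LHC gives
# stat_rescaling_factor(139.0) ~ 0.215, the per-dataset analogue of -r 0.1.
```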
juanrojochacon commented 12 months ago

Some comments

jacoterh commented 6 months ago

Ready to merge @tgiani @giacomomagni

tgiani commented 6 months ago

looks good to me