LHCfitNikhef / smefit_release

SMEFiT: a Standard Model Effective Field Theory fitter
GNU General Public License v3.0

Projections #57

Closed · tgiani closed this 6 months ago

tgiani commented 1 year ago

This PR should allow the user to create projections starting from an existing dataset. The central values should be given either by the SM prediction or by the SM plus a set of EFT corrections specified by the user.
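As a rough illustration of that construction (not the PR's actual code), the projected central values amount to the SM prediction plus the user-specified EFT shifts. The helper below is a minimal sketch keeping only linear corrections; all names in it are hypothetical:

```python
import numpy as np

def projected_central_values(sm_predictions, linear_corrections, wilson_coefficients):
    """Sketch: central values as SM plus user-specified (linear) EFT corrections.

    Hypothetical helper, not the PR implementation. `sm_predictions` is the array
    of SM predictions per bin, `linear_corrections` maps a Wilson-coefficient name
    to its linear EFT correction per bin, and `wilson_coefficients` holds the
    values chosen by the user. With no coefficients the result is the pure SM.
    """
    central_values = np.array(sm_predictions, dtype=float)
    for name, value in wilson_coefficients.items():
        central_values += value * np.asarray(linear_corrections[name])
    return central_values
```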

codecov[bot] commented 1 year ago

Codecov Report

Attention: Patch coverage is 0%, with 72 lines in your changes missing coverage. Please review.

Project coverage is 41.95%. Comparing base (2f13519) to head (46ff994). Report is 11 commits behind head on main.

:exclamation: Current head 46ff994 differs from pull request most recent head 9434bc3. Consider uploading reports for the commit 9434bc3 to get more accurate results

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #57      +/-   ##
==========================================
- Coverage   42.01%   41.95%   -0.06%
==========================================
  Files          29       28       -1
  Lines        2378     2255     -123
==========================================
- Hits          999      946      -53
+ Misses       1379     1309      -70
```

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `41.95% <0.00%> (-0.06%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. See the Codecov carryforward-flags documentation to find out more.

| Files | Coverage Δ | |
|---|---|---|
| `src/smefit/cli/__init__.py` | `0.00% <0.00%> (ø)` | |
| `src/smefit/projections/__init__.py` | `0.00% <0.00%> (ø)` | |

... and 11 files with indirect coverage changes
tgiani commented 1 year ago

@jacoterh @giacomomagni There is still some cleaning to be done and I should add some tests and update the docs, but if you have time please start having a look. The idea is that there is a new runcard, runcards/projection.yaml, where you specify the information needed to build the projection, and you then run smefit PROJ runcards/projection.yaml -r 0.1, where 0.1 is the factor used to reduce the statistical error. In the runcard you specify the datasets for which you want to build projections and the values of the Wilson coefficients to use for the central values (in case you want central values which are not SM-like); a minimal example is sketched below. The code creates a new folder, projection, where the new datasets are saved.
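For orientation, a projection runcard could look roughly like the sketch below. The field names (`projections_path`, `datasets`, `coefficients`) and the dataset names are illustrative placeholders, not the exact schema used in this PR:

```yaml
# Hypothetical sketch of runcards/projection.yaml -- field names and dataset
# names are assumptions, not the schema actually used in this PR.
projections_path: ./projection   # folder where the projected data are written
datasets:                        # existing datasets to build projections for
  - SOME_DATASET_1
  - SOME_DATASET_2
coefficients:                    # Wilson-coefficient values for non-SM central values
  OtG: 0.5
  Otp: -1.0
```

which would then be run as `smefit PROJ runcards/projection.yaml -r 0.1`.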

jacoterh commented 1 year ago

Hi @tgiani, thanks a lot for this. The code runs for me, so that's great! Two questions:

  1. What happens in case the statistical uncertainties are already zero by construction? This happens whenever the total experimental correlation matrix includes both systematic and statistical uncertainties. I guess there is not much we can do in that case?
  2. I did a quick test by setting all WCs to zero in the runcard to see whether the central values remained the same, and I observed some small differences. I tried with the dilepton dataset, for example. Do you reproduce this, or am I missing something obvious?
juanrojochacon commented 1 year ago

Hi @jacoterh concerning your questions:

tgiani commented 1 year ago

Hi @jacoterh

  1. Just as @juanrojochacon said: if you use the code with one of these datasets, you'll just get back something whose statistical uncertainty is again equal to 0 (since it is included in the systematic part, as you said).
  2. Again as @juanrojochacon said: the central values are by default generated using the SM predictions in the theory tables, so if you set all the WCs to 0 (or don't specify them in the runcard) you get the SM. Which dataset did you use for the test?

jacoterh commented 1 year ago

Thanks @juanrojochacon and @tgiani, that clarifies my questions! I do retrieve the SM predictions with all WCs set to zero and in the absence of statistical uncertainties. I just did not realise soon enough that the pseudo-data is generated starting from the theory predictions. All good!

jacoterh commented 12 months ago

Some suggestions that came to mind while working with a scaled-up version:

  1. The rescaling factor, currently specified by e.g. -r 0.1, should be updated to handle non-uniform rescalings. At HL-LHC the luminosity is 3 ab^-1 for all datasets, while the datasets are not all taken at the same luminosity before the projection, so we need different rescaling factors depending on the original luminosity.
  2. In view of point 1, we need to store the original luminosity in the runcard. We can then use this information to project with a syntax like smefit PROJ -lumi <x>; the rescaling factor from point 1 becomes dataset specific and is computed on the fly (see the sketch after this list).
  3. We would like a dataset filter based on features like luminosity, centre-of-mass energy, etc., that lets us run fit variants with different types of datasets.
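A minimal sketch of the dataset-specific rescaling from point 2, assuming purely statistical uncertainties scale as 1/sqrt(luminosity); the function name and the way luminosities are passed are assumptions, not this PR's interface:

```python
import numpy as np

HL_LHC_LUMI = 3000.0  # target HL-LHC luminosity in fb^-1 (i.e. 3 ab^-1)

def stat_rescaling_factor(original_lumi, target_lumi=HL_LHC_LUMI):
    """Factor multiplying the statistical uncertainties of one dataset.

    Assumes stat. errors scale as 1/sqrt(L), so projecting a dataset taken at
    `original_lumi` (fb^-1) to `target_lumi` multiplies its statistical
    uncertainties by sqrt(original_lumi / target_lumi).
    """
    return np.sqrt(original_lumi / target_lumi)

# Example: a 139 fb^-1 Run 2 dataset projected to HL-LHC gives
# stat_rescaling_factor(139.0) ~ 0.215, the per-dataset analogue of -r 0.1.
```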
juanrojochacon commented 12 months ago

Some comments

jacoterh commented 6 months ago

Ready to merge @tgiani @giacomomagni

tgiani commented 6 months ago

looks good to me