Closed RosalynLP closed 5 years ago
ISTM that what we want is to first collect results
over dataspecs instead of:
/n/nnpdf (prescrip2 %) $ validphys --help results_bytheoryids
results_bytheoryids
Defined in: reportengine.resourcebuilder
results_bytheoryids()
The result of `results` for each in ('theoryids',).
and then everything else should follow (either with a fair amount of duplicated functions that only call the old functions or with NNPDF/reportengine#63 ).
Note that we already have dataspecs_results
I think the runcard should look something like this. Please note there is this annoying bug at the moment https://github.com/NNPDF/reportengine/issues/16
fit: XXX
use_cuts: "fromfit"
pdf: YYY
dataspecs:
  - theoryid: ZZZ
    experiments: ... # Probably has to go here for now. Sorry!
  - theoryid: XYXY
    experiments: ...
  ...
and then the actions would collect over:
matched_datasets_from_dataspecs::dataspecs_with_matched_cuts::dataspecs_results
or something like that. Note that this is the same as the shift matrix business.
Any progress with this? Any problems I could help with?
Yes I'm still having problems with the runcards. Currently I have:
meta:
author: Rosalyn Pearson
keywords: [test, theory uncertainties, matched cuts]
title: Testing theory covariance matrix with matched cuts
default_theory:
- theoryid: 163
fivetheories: nobar
theoryids:
- 163
- 177
- 176
- 179
- 174
# - 180
# - 173
# - 175
# - 178
dataspecs:
- theoryid: 163
speclabel: $(\xi_F,\xi_R)=(1,1)$
- experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 177
speclabel: $(\xi_F,\xi_R)=(2,1)$
- experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 176
speclabel: $(\xi_F,\xi_R)=(0.5,1)$
- experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 179
speclabel: $(\xi_F,\xi_R)=(1,2)$
- experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 174
speclabel: $(\xi_F,\xi_R)=(1,0.5)$
- experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
# - theoryid: 180
# speclabel: $(\xi_F,\xi_R)=(2,2)$
# - theoryid: 173
# speclabel: $(\xi_F,\xi_R)=(0.5,0.5)$
# - theoryid: 175
# speclabel: $(\xi_F,\xi_R)=(2,0.5)$
# - theoryid: 178
# speclabel: $(\xi_F,\xi_R)=(0.5,2)$
normalize_to: 1
use_cuts: 'fromfit'
fit: NNPDF31_nlo_as_0118_1000
pdf:
from_: fit
#template_text: |
#
# {@with default_theory@}
#
# {@plot_thcorrmat_heatmap_custom@}
#
# {@endwith@}
actions_:
# - report(main=true)
- matched_datasets_from_dataspecs::dataspecs_with_matched_cuts::dataspecs_results plot_thcorrmat_heatmap_custom
and I am getting the error
[ERROR]: Bad configuration encountered:
A parameter is required: theoryid.
This is needed to process:
- experiments
trough:
- dataspecs
trough:
- matched_datasets_from_dataspecs
trough:
- ()
trough:
- plot_thcorrmat_heatmap_custom
Maybe you mistyped theoryid in one of the following keys?
- theoryids
- fivetheories
I also don't really understand whether this action I am doing is the right thing - I haven't yet altered anything in the code either as I am not really able to debug without getting a basic runcard to work.
sorry that was an accident
Note the runcard above is wrong in that it has the structure:
dataspecs: [
    {theoryid: ...},
    {experiments: ...},
    {theoryid: ...},
    {experiments: ...},
    ...
]
rather than:
dataspecs: [
    {experiments: ..., theoryid: ...},
    {experiments: ..., theoryid: ...},
    ...
]
which is what the error message is telling you.
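The difference between the two structures can be checked mechanically once the runcard is loaded into Python objects. The sketch below is only an illustration: `find_incomplete_dataspecs` is a hypothetical helper, not part of validphys or reportengine, and the experiment lists are shortened placeholders.

```python
def find_incomplete_dataspecs(dataspecs, required=("theoryid", "experiments")):
    """Return (position, missing_keys) for every entry lacking a required key."""
    problems = []
    for position, spec in enumerate(dataspecs):
        missing = [key for key in required if key not in spec]
        if missing:
            problems.append((position, missing))
    return problems


# Structure of the failing runcard: the keys are split across list items.
split_specs = [
    {"theoryid": 163},
    {"experiments": ["NMC", "SLAC"]},
    {"theoryid": 177},
    {"experiments": ["NMC", "SLAC"]},
]

# Intended structure: one mapping per dataspec holding both keys.
merged_specs = [
    {"theoryid": 163, "experiments": ["NMC", "SLAC"]},
    {"theoryid": 177, "experiments": ["NMC", "SLAC"]},
]

print(find_incomplete_dataspecs(split_specs))
print(find_incomplete_dataspecs(merged_specs))
```

In the split layout every second entry has `experiments` but no `theoryid`, which matches the "A parameter is required: theoryid" message.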
I don't understand sorry, I tried taking the '-' away from the start of "experiments" but this didn't help
What does didn't help mean? I don't think it can give the same error.
Ah, initially I left one with a dash by accident but now it says
[ERROR]: Bad configuration encountered:
A parameter is required: dataspecs_results.
This is needed to process:
- (('matched_datasets_from_dataspecs', 0), ('dataspecs_with_matched_cuts', 0))
trough:
- plot_thcorrmat_heatmap_custom
Maybe you mistyped dataspecs_results in one of the following keys?
- dataspecs
This is because dataspecs_results is not something you are supposed to expand namespaces over, but rather something you are supposed to collect over (my earlier message wasn't all that clear in that regard). However, it should be easy enough to look at how matched_datasets_shift_matrix works and do the equivalent thing. Note that pretty much the only change is to call make_scale_covmat instead of computing the shifts.
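To make the "collect over" idea concrete, here is a toy model in plain Python. It is not reportengine's implementation: `toy_collect` and `describe` are made-up names, and the namespace handling is reduced to layering each list entry over the top-level keys.

```python
def toy_collect(provider, root, namespace_key):
    """Evaluate ``provider`` once per entry of ``root[namespace_key]``.

    Each entry is layered on top of the root-level keys, so values set
    inside the entry take precedence. The return value is always a list.
    """
    results = []
    for entry in root[namespace_key]:
        scope = {**root, **entry}
        results.append(provider(scope))
    return results


def describe(scope):
    # Stand-in provider: it needs both "pdf" and "theoryid" to be resolvable.
    return f"{scope['pdf']}@{scope['theoryid']}"


runcard = {
    "pdf": "YYY",
    "dataspecs": [{"theoryid": 163}, {"theoryid": 177}],
}

per_spec = toy_collect(describe, runcard, "dataspecs")
print(per_spec)
```

Calling `describe(runcard)` directly fails with a `KeyError`, because `theoryid` only exists inside the individual `dataspecs` entries. This is roughly the situation the "A parameter is required" errors describe.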
I'm really confused, in that case what do I put in the runcard? What is wrong with the current runcard?
Have a look at this runcard:
https://vp.nnpdf.science/NlltmlyWRRqCtSeJbi1xIQ==/input/runcard.yaml
and the corresponding code and try to work out how things get passed around (maybe run it with --debug). Btw it is quite likely that it is affected by the reportengine bug and all the differences are due to changing the pdf...
Also we don't want to call make_scale_var_covmat right? Because that won't correlate between process types in the correct way. Ultimately we want to call theory_covmat_custom but this is not easily equatable with matched_datasets_shift_matrix, or at least I don't see how to write an equivalent (this is what I was trying to do earlier).
Sorry Zahari, this is the runcard I have been looking at most of the day and I just really don't understand it and can't get it to work properly for some reason
I just don't understand how to extend it to the point prescription case, I don't think it is an obvious extension
Incidentally ISTM that the runcard works well, which makes the bug in re even more confusing.
Wait what, the runcard I pasted above?
The one with the shift plots.
Ah no I mean I think I understand how the shift plots work but I just am having difficulty doing an equivalent because
a) I don't understand how to adjust the runcard
b) I am not sure what to feed in to which new functions.
What I am attempting is
matched_dataspecs_dataspecs_results = collect('dataspecs_results', ['dataspecs_with_matched_cuts'])
matched_datasets_matched_dataspecs_dataspecs_results = collect('matched_dataspecs_dataspecs_results', ['matched_datasets_from_dataspecs'])
Then writing a new combine_by_type which takes matched_datasets_matched_dataspecs_dataspecs_results rather than each_dataset_results_bytheory but has no other changes.
Is this correct? Is there any part of this which is wrong?
ISTM that everything could be adapted more or less easily (but not trivially) by changing the namespaces the various actions collect over. E.g. this
results_bytheoryids = collect(results,('theoryids',))
each_dataset_results_bytheory = collect('results_bytheoryids', ('experiments', 'experiment'))
could become:
results_bytheoryids = collect(results,('dataspecs_with_matched_cuts',))
each_dataset_results_bytheory = collect('results_bytheoryids', ('matched_datasets_from_dataspecs',))
and then maybe you'll need some function to get the right dataframe index (I wrote the functionality inside some other provider).
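The shape produced by two nested collects can be pictured with plain lists. The sketch below assumes the outer collect runs over the matched datasets and the inner one over the dataspecs, as in the snippet above. `nested_collect` and `fake_result` are illustrative names only, and the results are placeholder tuples rather than real validphys objects.

```python
def nested_collect(provider, datasets, dataspecs):
    """Outer loop over datasets, inner loop over dataspecs."""
    return [[provider(ds, spec) for spec in dataspecs] for ds in datasets]


def fake_result(dataset, theoryid):
    # Placeholder for whatever ``results`` would return for one combination.
    return (dataset, theoryid)


datasets = ["NMC", "SLACP", "BCDMSP"]
theoryids = [163, 177]

# by_dataset[i][j] is the result for dataset i under dataspec j.
by_dataset = nested_collect(fake_result, datasets, theoryids)

# Transposing regroups the same results per dataspec instead.
by_theory = [list(column) for column in zip(*by_dataset)]

print(by_dataset[1][0])
print(by_theory[1][2])
```

A function written for the old `each_dataset_results_bytheory` layout should keep working as long as this outer/inner ordering is preserved.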
@RosalynLP Yes, what you are doing seems like what I said.
OK great but I keep getting this problem:
[ERROR]: Bad configuration encountered:
A parameter is required: dataset_input.
This is needed to process:
- commondata
trough:
- report
trough:
- template_text
trough:
- plot_thcorrmat_heatmap_custom
trough:
- theory_corrmat_custom
trough:
- theory_covmat_custom
trough:
- covs_pt_prescrip
trough:
- combine_by_type
trough:
- process_lookup
trough:
- commondata_experiments
Initially I had
#commondata_experiments = collect('commondata', ['experiments', 'experiment'])
and I tried changing it to
commondata_experiments = collect('commondata',
('matched_datasets_from_dataspecs',))
but I still get the issue because of commondata itself.
I only need the names of the experiments for this so I could take it from any dataspec but I am not sure how to do the syntax for this
OK I did this instead
commondata_experiments_sub = collect('commondata', ['dataspecs_with_matched_cuts'])
commondata_experiments = collect('commondata_experiments_sub',['matched_datasets_from_dataspecs'])
@Zaharid when you say "and then maybe you'll need some function to get the right dataframe index (I wrote the functionality inside some other provider)." I presume what you mean is the fact that experiments_index doesn't work and gives the error
[ERROR]: Bad configuration encountered:
A parameter is required: experiments.
This is needed to process:
- report
trough:
- template_text
trough:
- plot_thcorrmat_heatmap_custom
trough:
- theory_corrmat_custom
trough:
- theory_covmat_custom
trough:
- experiments_index
but I don't understand what the statement "I wrote the functionality inside some other provider" means - what are the different providers? So the issue is experiments_index loads in the experiments before the cuts have been matched or something? What is the correct input rather than experiments to this kind of function?
So we want to take in only the datasets which are mutual, i.e. those in matched_datasets_from_dataspecs?
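As a rough picture of what "mutual" means here, the sketch below assumes that matched datasets are those present in every dataspec, and that matched cuts keep only the points surviving the cuts in every dataspec. The helper names and all point indices are made up for illustration; this is not the validphys implementation.

```python
def matched_dataset_names(dataspecs):
    """Names of datasets that appear in every dataspec, in first-spec order."""
    common = set.intersection(*(set(spec["cuts"]) for spec in dataspecs))
    return [name for name in dataspecs[0]["cuts"] if name in common]


def matched_cuts(dataspecs, name):
    """Sorted point indices of ``name`` that are kept by every dataspec."""
    kept = set.intersection(*(set(spec["cuts"][name]) for spec in dataspecs))
    return sorted(kept)


# Hypothetical kept-point indices per dataset for two dataspecs.
nlo = {
    "speclabel": "NLO",
    "cuts": {"NMC": [0, 1, 2, 4], "SLACP": [0, 1], "CDFZRAP": [0, 3]},
}
nnlo = {
    "speclabel": "NNLO",
    "cuts": {"NMC": [1, 2, 3, 4], "SLACP": [1, 2]},
}

names = matched_dataset_names([nlo, nnlo])
print(names)
print({name: matched_cuts([nlo, nnlo], name) for name in names})
```

In this toy setup `CDFZRAP` drops out because only one dataspec contains it, and each remaining dataset keeps only the points common to both.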
@Zaharid this whole thing makes no sense to me. Even if I calculate the theory covmats using this runcard and matched_datasets_from_dataspecs, it is taking the dataspecs to be the different scale varied dataspecs, not the two dataspecs for NLO and NNLO with NNPDF3.1. So I somehow want to have two kinds of groupings in the runcard, one for the scale varied dataspecs, and one for the shift dataspecs. We then need the shift dataspecs to do as your functions already do and compute the shift matrix, and we need the other dataspecs to do the theory covmat stuff which previously existed. But we want to do all that just for the points belonging to the matched datasets from the OTHER (shift) dataspecs. Do you know how to separate these two things?
On Wed, Oct 17, 2018 at 4:51 PM Zahari Dim replied to RosalynLP's 4:24 PM comment above:
This can be solved with various kinds of namespaces:
shiftconfig:
  dataspecs:
    - ... # nlo vs nnlo
thcovconfig:
  dataspecs:
    - ... # bazillion theories
# TODO: find better names
shift_mat_for_comparison = collect('shift_matrix_whatever_was_called', ['shiftconfig'])
th_mat_for_comparison = collect('thcovmat_custom_whatever', ['thcovconfig'])
def do_some_comparison(shift_mat_for_comparison, th_mat_for_comparison):
    # because collect always returns a list
    shift_mat = shift_mat_for_comparison[0]
    th_mat = th_mat_for_comparison[0]
    ...
mmm probably the fact that collect always returns a list is annoying enough to justify a collect_one or somesuch. Anyhow, let's get the thcovmat done first!
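The `collect_one` idea floated above could look something like the following. This is purely a sketch of a hypothetical helper: no such function is assumed to exist in reportengine, and the two one-element lists stand in for what collecting over a single `shiftconfig` or `thcovconfig` namespace would return.

```python
def collect_one(collected, name="value"):
    """Unwrap a collected list that is expected to hold exactly one item."""
    if len(collected) != 1:
        raise ValueError(f"expected exactly one {name}, got {len(collected)}")
    return collected[0]


# Stand-ins for the collected results: each is a list with a single matrix.
shift_mat_for_comparison = [[[1.0, 0.5], [0.5, 0.25]]]
th_mat_for_comparison = [[[2.0, 0.0], [0.0, 2.0]]]


def do_some_comparison(shift_mats, th_mats):
    shift_mat = collect_one(shift_mats, "shift matrix")
    th_mat = collect_one(th_mats, "theory covariance matrix")
    # Toy comparison: ratio of the diagonal entries of the two matrices.
    return [shift_mat[i][i] / th_mat[i][i] for i in range(len(th_mat))]


print(do_some_comparison(shift_mat_for_comparison, th_mat_for_comparison))
```

Failing loudly when the list does not have exactly one element guards against silently picking the wrong matrix if a namespace later grows into a list.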
OK so when are the matched cuts being applied? Before this collect function presumably? In which case how do they know which dataspecs to pick? Or does the fact you collect over a certain namespace have an effect? I basically still don't see that this will make the theory covmat have the matched cuts for the shift comparison.
NNPDF/nnpdf#309
I don't understand how the collect function is working here, for the shift matrix the workflow is essentially
import numpy as np
import pandas as pd
from reportengine import collect

matched_dataspecs_dataset_prediction_shift = collect(
    'dataspecs_dataset_prediction_shift', ['matched_datasets_from_dataspecs'])

def matched_datasets_shift_matrix(matched_dataspecs_dataset_prediction_shift):
    """Produce a matrix out of the outer product of
    ``dataspecs_dataset_prediction_shift``. The matrix will be a
    pandas DataFrame, indexed similarly to ``experiments_index``."""
    # One flat vector of shifts over all matched datasets
    all_shifts = np.concatenate(
        [val.shifts for val in matched_dataspecs_dataset_prediction_shift])
    mat = np.outer(all_shifts, all_shifts)
    # Build the index: one (experiment, dataset, point) label per shift
    expnames = np.concatenate([
        np.full(len(val.shifts), val.experiment_name, dtype=object)
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    dsnames = np.concatenate([
        np.full(len(val.shifts), val.dataset_name, dtype=object)
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    point_indexes = np.concatenate([
        np.arange(len(val.shifts))
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    index = pd.MultiIndex.from_arrays(
        [expnames, dsnames, point_indexes],
        names=["Experiment name", "Dataset name", "Point"])
    return pd.DataFrame(mat, columns=index, index=index)

shift_mat_for_comparison = collect('matched_datasets_shift_matrix', ['shiftconfig'])
So I don't see how this works: first the matched_datasets_from_dataspecs won't know which dataspecs to use, right, then even if that works you should end up with a list of matrices collected over the two theories NLO and NNLO or something, which makes no sense to me.
And then as for theories it won't know which dataspecs to use to evaluate the theory covmat, you'll end up with it using at best the mutual cuts from the scale varied theories, which aren't the same as for the NLO/NNLO mutual cuts, and then you will collect over all the theories, so end up with a list of matrices. But as far as I can see it will fail before this stage.
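To see what the shift-matrix function does with its input, here is a dependency-free sketch of the same bookkeeping. It assumes the collected input is a flat list with one entry per matched dataset, each carrying `experiment_name`, `dataset_name` and `shifts` as in the code above. `DatasetShift` and the numbers are invented for illustration, and plain lists replace NumPy and pandas.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetShift:
    experiment_name: str
    dataset_name: str
    shifts: List[float]


def shift_matrix_with_labels(per_dataset):
    """Return (labels, matrix) where matrix is the outer product of all shifts."""
    # Concatenate the shifts of every dataset into one flat vector.
    all_shifts = [s for val in per_dataset for s in val.shifts]
    # One (experiment, dataset, point) label per entry of that vector.
    labels = [
        (val.experiment_name, val.dataset_name, point)
        for val in per_dataset
        for point in range(len(val.shifts))
    ]
    matrix = [[a * b for b in all_shifts] for a in all_shifts]
    return labels, matrix


collected = [
    DatasetShift("NMC", "NMCPD", [0.5, -1.0]),
    DatasetShift("SLAC", "SLACP", [2.0]),
]

labels, matrix = shift_matrix_with_labels(collected)
print(labels)
print(matrix)
```

Under these assumptions the list over datasets is consumed inside the function and a single matrix comes out, so a later collect over one `shiftconfig` namespace would wrap that single matrix in a one-element list.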
Regardless, I am having problems just getting the formatting on the runcard to work as it doesn't like all the different blocks:
Failed to parse yaml file: while parsing a block mapping
in "matched_test_notab.yaml", line 23, column 9
expected <block end>, but found '-'
in "matched_test_notab.yaml", line 100, column 9
meta:
author: Rosalyn Pearson
keywords: [test, theory uncertainties, matched cuts]
title: Testing theory covariance matrix with matched cuts
default_theory:
- theoryid: 163
fivetheories: nobar
theoryids:
- 163
- 177
- 176
- 179
- 174
# - 180
# - 173
# - 175
# - 178
thcovconfig:
dataspecs:
- theoryid: 163
speclabel: $(\xi_F,\xi_R)=(1,1)$
experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 177
speclabel: $(\xi_F,\xi_R)=(2,1)$
experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 176
speclabel: $(\xi_F,\xi_R)=(0.5,1)$
experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 179
speclabel: $(\xi_F,\xi_R)=(1,2)$
experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
- theoryid: 174
speclabel: $(\xi_F,\xi_R)=(1,0.5)$
experiments:
# Fixed target DIS
- experiment: NMC
datasets:
- dataset: NMCPD
- dataset: NMC
- experiment: SLAC
datasets:
- dataset: SLACP
- dataset: SLACD
- experiment: BCDMS
datasets:
- dataset: BCDMSP
- dataset: BCDMSD
- experiment: NTVDMN
datasets:
- dataset: NTVNUDMN
- dataset: NTVNBDMN
- experiment: CHORUS
datasets:
- dataset: CHORUSNU
- dataset: CHORUSNB
# Combined HERA charm production cross-sections
- experiment: HERAF2CHARM
datasets:
- dataset: HERAF2CHARM
# HERA data
- experiment: HERACOMB
datasets:
- dataset: HERACOMBNCEM
- dataset: HERACOMBNCEP460
- dataset: HERACOMBNCEP575
- dataset: HERACOMBNCEP820
- dataset: HERACOMBNCEP920
- dataset: HERACOMBCCEM
- dataset: HERACOMBCCEP
# F2bottom data
- experiment: F2BOTTOM
datasets:
- dataset: H1HERAF2B
- dataset: ZEUSHERAF2B
- experiment: ATLAS
datasets:
- dataset: ATLASWZRAP36PB
- dataset: ATLASZHIGHMASS49FB
- dataset: ATLASLOMASSDY11EXT
- dataset: ATLASWZRAP11
- dataset: ATLAS1JET11
- dataset: ATLASZPT8TEVMDIST
- dataset: ATLASZPT8TEVYDIST
- dataset: ATLASTTBARTOT
- dataset: ATLASTOPDIFF8TEVTRAPNORM
- experiment: CMS
datasets:
- dataset: CMSWEASY840PB
- dataset: CMSWMASY47FB
- dataset: CMSWCHARMRAT
- dataset: CMSDY2D11
- dataset: CMSWMU8TEV
- dataset: CMSJETS11
- dataset: CMSTTBARTOT
- dataset: CMSTOPDIFF8TEVTTRAPNORM
- experiment: LHCb
datasets:
- dataset: LHCBZ940PB
- dataset: LHCBZEE2FB
- experiment: CDF
datasets:
- dataset: CDFZRAP
- dataset: CDFR2KT
- experiment: D0
datasets:
- dataset: D0ZRAP
- dataset: D0WEASY
- dataset: D0WMASY
# - theoryid: 180
# speclabel: $(\xi_F,\xi_R)=(2,2)$
# - theoryid: 173
# speclabel: $(\xi_F,\xi_R)=(0.5,0.5)$
# - theoryid: 175
# speclabel: $(\xi_F,\xi_R)=(2,0.5)$
# - theoryid: 178
# speclabel: $(\xi_F,\xi_R)=(0.5,2)$
shiftconfig:
dataspecs:
- theoryid: 52
pdf: NNPDF31_nlo_as_0118_hessian
speclabel: "NLO"
fit: NNPDF31_nlo_as_0118_1000
- theoryid: 53
pdf: NNPDF31_nnlo_as_0118_hessian
speclabel: "NNLO"
fit: NNPDF31_nnlo_as_0118_1000
normalize_to: 1
use_cuts: 'fromfit'
fit: NNPDF31_nlo_as_0118_1000
pdf:
from_: fit
template_text: |
{@with default_theory@}
{@plot_thcorrmat_heatmap_custom@}
{@endwith@}
{@with shiftconfig@}
{@plot_matched_datasets_shift_matrix@}
{@plot_matched_datasets_shift_matrix_correlations@}
{@endwith@}
actions_:
- report(main=true)
I am trying to work on creating theory covmats with cuts that match those of the shift matrices here: https://vp.nnpdf.science/NlltmlyWRRqCtSeJbi1xIQ==/.
Currently the theory covmats take in
so I have tried to alter them to take in
each_dataset_results_matched = collect('results_bytheoryids', ['dataspecs_with_matched_cuts']).
However, I am getting the error
and I am not sure why. Is this the way I should be trying to do this?