Open cschwan opened 2 years ago
CC @felixhekhorn @AleCandido @scarlehoff @Radonirinaunimi
That's a perfect solution for me.
The only additional proposal is that, since PineAPPL grid metadata values are always strings, I would make the value of `nuclear_model_x` valid JSON, for parsing simplicity. E.g., for the deuteron (mass number 2, one proton):
`nuclear_model_1: {"A": 2, "Z": 1}`
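To illustrate why a JSON value would be convenient on the reading side, here is a minimal sketch. It assumes the metadata is available as a plain Python dictionary of strings; the helper name `parse_nuclear_model` and the lead-208 example value are hypothetical and not part of PineAPPL.

```python
import json


def parse_nuclear_model(metadata, index):
    """Return (A, Z) from a JSON-valued `nuclear_model_<index>` key, or None if absent."""
    raw = metadata.get(f"nuclear_model_{index}")
    if raw is None:
        return None
    model = json.loads(raw)
    return int(model["A"]), int(model["Z"])


# Hypothetical metadata for a grid whose second initial state is lead-208.
metadata = {"nuclear_model_2": '{"A": 208, "Z": 82}'}
print(parse_nuclear_model(metadata, 2))
```

With a JSON value no custom string splitting is needed, and extra fields could be added later without breaking existing readers.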
Thanks a lot @cschwan for this. This proposal is also perfect for me (this would make my life infinitely easier).
I also very much like this way of representing the metadata, which would also make `validphys` very happy.
@AleCandido @Radonirinaunimi : yes, let's do that!
For the time being (to get the ball rolling) I will write a script to burn in the metadata a posteriori. Instead of doing it as in this PR https://github.com/NNPDF/nnpdf/pull/1632 (where the information is put in the fit runcard), the relevant theory will be modified to contain the metadata as discussed in this issue.
That way when this is implemented in pineappl (issue #118?) the number of changes in vp will be minimal (maybe the way in which the information is retrieved is changed, but nothing beyond that).
Another problem that we should keep in mind is that

* real protons and
* 'protons as the average nucleus' in nuclei

both unfortunately have the same PDG number, and that leads to potential problems in `Grid::optimize`; it assumes that all protons are equal, and that clearly isn't the case here. @Radonirinaunimi this might be a problem that you've already stumbled over.
@cschwan Is this a problem at the level of the storing of the partonic bits or at the level of the convolution? I am not sure if the following applies to the above but usually the way I've dealt with the two different scenarios so far is to always generate grids for the real/free protons and account for the isospin asymmetry later.
Let's say you have a proton-lead collision, and generate your grid using `initial_state_1` and `initial_state_2` set to `2212`. In that case you should generate a grid where, for instance, u u~ is treated differently from u~ u because both quarks come from different hadrons. This becomes a problem when you optimize the grid because PineAPPL sees that the initial-state PDG IDs are both `2212`, and therefore symmetrizes by merging u~ u into u u~. However, this is wrong, because the two 'protons' aren't the same. This could for instance mean that for DY all quarks come from the first hadron, and all anti-quarks from the second hadron. If the two hadrons aren't actually the same you'll get wrong numbers.
Practically you can check this by doing your analyses with your default grid and one where you make sure that it's not optimized.
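The effect can be seen in a toy model. The sketch below is not PineAPPL's implementation of `Grid::optimize`: it assumes a single x point, two channels, and made-up weights and PDF values, and the helpers `convolve` and `symmetrize` are hypothetical names introduced only for illustration.

```python
def convolve(weights, f1, f2):
    """Sum w * f1[a] * f2[b] over all channels (a, b)."""
    return sum(w * f1[a] * f2[b] for (a, b), w in weights.items())


def symmetrize(weights):
    """Merge channel (b, a) into (a, b), as is only valid for identical hadrons."""
    merged = {}
    for (a, b), w in weights.items():
        key = (a, b) if a <= b else (b, a)  # canonical ordering of the pair
        merged[key] = merged.get(key, 0.0) + w
    return merged


# Made-up channel weights and PDF values at one x point.
weights = {("u", "ubar"): 2.0, ("ubar", "u"): 3.0}
free_proton = {"u": 0.5, "ubar": 0.125}
proton_in_lead = {"u": 0.25, "ubar": 0.25}

# Identical hadrons: merging the channels changes nothing.
pp_full = convolve(weights, free_proton, free_proton)
pp_merged = convolve(symmetrize(weights), free_proton, free_proton)

# Different hadrons sharing the PDG ID 2212: merging changes the result.
pa_full = convolve(weights, free_proton, proton_in_lead)
pa_merged = convolve(symmetrize(weights), free_proton, proton_in_lead)

print(pp_full, pp_merged)  # equal
print(pa_full, pa_merged)  # different
```

Comparing an optimized and a non-optimized grid in a real analysis is the same check, just with the full set of channels and x values.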
That's right! But there is actually a way around this, which is the procedure that has been adopted by nNNPDF in the previous releases. That is, the grids are always generated using $ep$ or $pp$, and to get to $eA$ or $pA$ one convolves the grids with:
$$ f^A(x) = Z f^{p/A}(x) + (A - Z) f^{n/A}(x)$$
with $f^{p/A}(x)$ and $f^{n/A}(x)$ denoting the bound-proton and bound-neutron PDFs respectively. Doing so of course assumes that all the nuclear datasets are corrected for isoscalarity even if $A \neq 2Z$.
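As a minimal sketch of the combination above, assuming the bound-proton and bound-neutron PDFs are available as plain Python callables of $x$ for one flavour (the toy shapes and the name `nuclear_pdf` are hypothetical, not taken from nNNPDF):

```python
def nuclear_pdf(f_p_bound, f_n_bound, A, Z):
    """Return f^A(x) = Z * f^{p/A}(x) + (A - Z) * f^{n/A}(x) as a callable."""
    def f_A(x):
        return Z * f_p_bound(x) + (A - Z) * f_n_bound(x)
    return f_A


# Toy bound-nucleon PDFs for a single flavour, for illustration only.
def toy_bound_proton(x):
    return 1.0 - x


def toy_bound_neutron(x):
    return (1.0 - x) ** 2


# Lead-208: A = 208 nucleons, of which Z = 82 are protons.
f_lead = nuclear_pdf(toy_bound_proton, toy_bound_neutron, A=208, Z=82)
print(f_lead(0.5))
```

The grid itself stays a $pp$ or $ep$ grid; only the object it is convolved with changes.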
@Radonirinaunimi if it's a problem, it would only be one for $pA$ generated using $pp$. In that case you probably shouldn't optimize.
However, this would be a problem with the Pineline, since at some point `optimize()` is called (in Pineko I believe, while in Pinefarm it should be up to the selected external implementation).
I agree. I think we should start investigating the size of the problem.
Just to echo the discussion from #265 and to summarize the situation: we need to replace `initial_state_1` with a more sophisticated structure, which states:
This has been mostly implemented in https://github.com/NNPDF/pineappl/pull/287. For general nuclei we could add a new type in `Convolution` that specifies `A` and `N`.
PineAPPL's metadata contains the keys `initial_state_1` and `initial_state_2`, which are the PDG Monte Carlo IDs of the PDFs that the grid must be convolved with.

However, I now realise that the name is confusing, because the PDFs do not necessarily agree with the actual hadronic initial states. For instance, when we use processes which collide lead nuclei with protons, but we want to fit only the proton, yadism for instance will convert the lead nuclei using isospin symmetry to an 'effective proton in lead' so that `initial_state_1 = 2212` and `initial_state_2 = 2212`.

I therefore suggest that we replace `initial_state_1` and `initial_state_2` with `pdf1` and `pdf2`, and add the following additional keys:

* `in1`: the actual initial state 1 of the collision as a PDG MC ID
* `in2`: same for the initial state 2

If `in1` and `pdf1` and/or `in2` and `pdf2` are different, this means that the situation above is true and that inside the grid we make assumptions about the 'nuclear model'; for this we should document the mass number (`A`) and number of protons (`Z`), for instance in the following way: `nuclear_model_1: A=2,Z=1` for deuterons.
Furthermore, if we have leptons or other non-hadronic particles in the initial state, the corresponding `pdf1`/`pdf2` should be present but empty.
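To make the proposed keys concrete, here is a rough sketch of how a consumer might read them. It assumes the metadata is a plain dictionary of strings; the `InitialState` class, the `read_initial_state` helper and the consistency check are hypothetical illustrations of the proposal, not an existing PineAPPL interface. The example uses the PDG nuclear code `1000822080` for lead-208 and `11` for the electron.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InitialState:
    pdf: Optional[int]            # PDG ID of the PDF to convolve with; None if non-hadronic
    actual: int                   # PDG ID of the particle that actually collides
    nuclear_model: Optional[str]  # raw `nuclear_model_<index>` string, if any


def read_initial_state(metadata, index):
    """Read `pdf<index>`, `in<index>` and `nuclear_model_<index>` from string metadata."""
    pdf_raw = metadata.get(f"pdf{index}", "")
    pdf = int(pdf_raw) if pdf_raw else None  # empty string means no PDF (e.g. a lepton)
    actual = int(metadata[f"in{index}"])
    model = metadata.get(f"nuclear_model_{index}")
    # If the PDF differs from the actual particle, a nuclear model should be documented.
    if pdf is not None and pdf != actual and model is None:
        raise ValueError(f"pdf{index} != in{index} but nuclear_model_{index} is missing")
    return InitialState(pdf, actual, model)


# Hypothetical metadata for electron-lead DIS fitted as an 'effective proton in lead'.
metadata = {
    "pdf1": "",
    "in1": "11",
    "pdf2": "2212",
    "in2": "1000822080",
    "nuclear_model_2": "A=208,Z=82",
}
print(read_initial_state(metadata, 1))
print(read_initial_state(metadata, 2))
```

Keeping `pdf1`/`pdf2` present but empty for leptons means a reader can always look up both keys without special-casing missing entries.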