Closed by alecandido 7 months ago
Do we have evidence that there really is a problem with `Grid::evolve`?
> Do we have evidence that there really is a problem with `Grid::evolve`?
The problem is in the signature: https://github.com/NNPDF/pineappl/blob/fc50e9c5d785076968f52ad9b913e32c0415bfa5/pineappl/src/evolution.rs#L168
If you ask for a rank-5 array, that's the full operator. And we know the size of the full EKO for these processes.
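To make the memory concern concrete, here is a rough back-of-the-envelope estimate; all dimensions are illustrative assumptions, not taken from any real EKO:

```python
# Rough memory estimate for a full rank-5 EKO versus a single rank-4 slice.
# All dimensions below are illustrative assumptions, not real EKO metadata.
n_q2 = 50            # number of factorization scales (assumed)
n_pid = 14           # flavors in the full basis
n_x = 50             # x-grid points (assumed)
bytes_per_entry = 8  # float64

full_rank5 = n_q2 * n_pid * n_x * n_pid * n_x * bytes_per_entry
one_rank4 = n_pid * n_x * n_pid * n_x * bytes_per_entry

print(f"full rank-5 operator: {full_rank5 / 1e9:.2f} GB")
print(f"one rank-4 slice:     {one_rank4 / 1e6:.2f} MB")
```

The full operator is larger than a single slice by exactly the number of scales, so loading one scale at a time bounds memory by one rank-4 slice.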
In particular, we need to pass instead a callable returning an `ArrayView4`, so that the object is loaded into memory only when you ask for that $Q^2$ element. This is available in Python, but not in Rust (however, it is just a matter of looking up the correct `.np{y,z}.lz4` file inside a tar folder, so not much different from what you have already done).
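As a sketch of the lazy-loading idea: each per-scale operator lives as its own member inside a tar archive and is only read on access. The member naming (`operator_{i}.npy`) and the plain `.npy` members are assumptions to keep the example stdlib-plus-numpy; the real EKO format uses lz4-compressed `.npy`/`.npz` members.

```python
# Sketch of a lazy per-scale loader: each rank-4 operator is a separate
# tar member and is read from disk only when requested.
import io
import tarfile
import tempfile

import numpy as np

# --- build a toy archive with one small "operator" per scale -------------
tmp = tempfile.NamedTemporaryFile(suffix=".tar", delete=False)
tmp.close()
with tarfile.open(tmp.name, "w") as tar:
    for i in range(3):
        buf = io.BytesIO()
        np.save(buf, np.full((2, 2, 2, 2), float(i)))  # toy rank-4 operator
        data = buf.getvalue()
        info = tarfile.TarInfo(name=f"operator_{i}.npy")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

# --- lazy loader: nothing is in memory until __getitem__ -----------------
class LazyOperator:
    def __init__(self, path):
        self.path = path

    def __getitem__(self, i):
        # load exactly one member; no caching, so memory stays bounded
        with tarfile.open(self.path, "r") as tar:
            member = tar.extractfile(f"operator_{i}.npy")
            return np.load(io.BytesIO(member.read()))

op = LazyOperator(tmp.name)
print(op[1].shape)  # only this one slice is loaded
```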
I understand that the whole thing will be big, but have you ever 1) run `Grid::evolve` with a large EKO, 2) seen that it really is slow, and 3) verified that the culprit is passing the entire operator? From your description it seems that you anticipate a performance problem, but so far it seems we don't have evidence for it.
It is not a performance problem, just a memory consumption problem.
Passing a 5-dim operator constructed from the 4-dim ones, or a loader for the 4-dim operators, will have exactly the same performance (unless there are more pieces involved, since you have to join them in Python, which might be slower). However, no one is concerned about these performance differences, only that the 5-dim object might be too big for some specific data sets on not-too-powerful machines, without a good reason to consume all that memory.
OK, I see! In that case, a straightforward optimization would need functions to perform the following operations:

1. Which `(pid1, pid0)` combinations have a non-zero operator? This is currently done here for the indices of the pids: https://github.com/NNPDF/pineappl/blob/fc50e9c5d785076968f52ad9b913e32c0415bfa5/pineappl/src/evolution.rs#L102-L118 However, this goes over the entire operator, and I think after generation EKO should already 'know' this.
2. Slices along the `muf2`- and `x1`-axes. These 3D operators I think we can safely pass around without any memory problems.

We are more or less slicing the other way round, as described above ($\mu_F^2 \mapsto O_4$).
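The non-zero-combination check mentioned above can be expressed as a reduction over the non-pid axes; a minimal numpy sketch, with the axis ordering `(pid1, x1, pid0, x0)` assumed:

```python
# Find which (pid1, pid0) combinations carry a non-zero operator by
# reducing over the x-axes. Axis ordering (pid1, x1, pid0, x0) is assumed.
import numpy as np

rng = np.random.default_rng(0)
op4 = np.zeros((3, 4, 3, 4))          # toy rank-4 operator
op4[0, :, 0, :] = rng.random((4, 4))  # only the (0, 0) block is non-zero
op4[2, :, 1, :] = rng.random((4, 4))  # ... and the (2, 1) block

nonzero = np.any(op4 != 0.0, axis=(1, 3))  # boolean, shape (pid1, pid0)
pairs = [(int(a), int(b)) for a, b in zip(*np.nonzero(nonzero))]
print(pairs)  # [(0, 0), (2, 1)]
```

If EKO recorded this block structure at generation time, the scan over the full operator could be skipped entirely.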
The reason is that different scales are really independent, while in flavor basis in general there are no vanishing pairs `(pid0, pid1)`.
Indeed, the operator is block-diagonal in evolution basis (QCD: singlet/gluon + diagonal, QCD+QED: singlet-u/singlet-d/gluon/photon + valence-u/valence-d + diagonal), and once you rotate even a single basis (input or output) to flavor basis, you have all entries filled.
The only special case is intrinsic: if you evolve charm below its matching scale (usually charm mass) then the evolution is the identity in flavor basis. But this is really the only case.
Moreover, in order to construct a slice containing the elements related to all $\mu_F^2$ values you have to load the full EKO (even though not all at once, so no memory problem, just inefficiency, because you load the same files from disk many times - and if you cache them, you get the memory issue back).
> We are more or less slicing the other way round, as described above ($\mu_F^2 \mapsto O_4$).

This can be done as well, but I'm not sure it's as efficient; we'd simply have to try it out.
> The reason is that different scales are really independent, while in flavor basis in general there are no vanishing pairs `(pid0, pid1)`. Indeed, the operator is block-diagonal in evolution basis (QCD: singlet/gluon + diagonal, QCD+QED: singlet-u/singlet-d/gluon/photon + valence-u/valence-d + diagonal), and once you rotate even a single basis (input or output) to flavor basis, you have all entries filled.
That's not entirely correct: for Nf=5 PDFs there are no top quarks at both the process and the fitting scale (PIDs `-6` and `6`), often there's no photon, and at the fitting scale there are no bottom quarks (PIDs `-5` and `5`). For some PDFs there aren't even charm quarks! For NNPDF4.0 we can decrease the dimensions from 14x14 to 9x11, and that's a reduction of the operator size of 50%!
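The quoted 50% figure can be checked directly with a quick arithmetic sketch:

```python
# Size reduction from dropping flavors that never appear:
# full 14x14 flavor block versus the reduced 9x11 block quoted for NNPDF4.0.
full = 14 * 14     # 196 entries per (x1, x0) pair
reduced = 9 * 11   # 99 entries per (x1, x0) pair
print(f"{reduced}/{full} = {reduced / full:.2%} of the original size")
```

That is 99/196, so the reduced operator is just over half the size, i.e. roughly a 50% saving.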
Ok, you're right.
But for most of the processes even bottom is there; it is just a fraction of DIS data that does not have the bottom, and almost always not for every value of $\mu^2$, so you can't drop the slice.
The further limitations are true only when fitting a specific PDF set (e.g. Nf3/Nf4/...), but these are considered variations in NNPDF (and mostly also by the other collaborations), so you only have a couple of them in a full release, and you usually develop only on the VFNS.
Moreover, as soon as we have QED (which has already been decided and is almost done), the default fits and evolution will include the photon. So you will have all flavors but bottom and top at the input scale, and all flavors but top at, at least, one output scale. So your typical operator will be 10x12 (for as long as no one proposes to fit the bottom, but for that we have time, since we're still digesting $c^-$).
> This can be done as well, but I'm not sure it's as efficient, but we'd simply have to try it out.
This is extremely convenient to evolve PDFs, and the only way to compute an EKO (since we can solve DGLAP only for one final scale at a time, but for all x values and flavors at the same time).
However, that EKO loader will be available anyhow, with the $\mu^2 \mapsto O_4$ layout, because it's the one matching the new output format. If you wish, once it is ready, you'll be able to test the most convenient way (and decide whether to use it as it is, or to reconstruct the $O_3$ slices, joining different final scales).
@AleCandido I thought a bit more about the problem of what interface we should have for this, and one idea would be to have a `Grid::evolve_slice`. This is basically `Grid::evolve`, but it only evolves one (factorization-scale) slice of the `Grid`, corresponding to a 4-dimensional EKO (slice), and ignores everything else. Then the user can combine `Grid::evolve_slice` and `Grid::merge` to sum every evolved slice. In this way we should be able to avoid the questions about callbacks. We can also rewrite `Grid::evolve` in terms of `Grid::evolve_slice`. What do you think?
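A minimal control-flow sketch of the proposed pattern; `evolve_slice` and `merge` below are toy stand-ins for the proposed `Grid::evolve_slice` / `Grid::merge`, and the "operators" are plain numbers, so only the loop structure mirrors the proposal:

```python
# Structural sketch: evolve one factorization-scale slice at a time,
# then merge the results. Everything here is a toy stand-in.
def evolve_slice(grid_slice, operator):
    # stand-in: a real implementation would contract the 4-dim EKO
    # with the subgrids belonging to this factorization scale
    return grid_slice * operator

def merge(evolved_slices):
    # stand-in for Grid::merge summing the evolved slices
    return sum(evolved_slices)

grid = {10.0: 1.5, 100.0: 2.5, 1000.0: 0.5}  # muf2 -> toy slice content
ekos = {10.0: 2.0, 100.0: 1.0, 1000.0: 4.0}  # muf2 -> toy 4-dim operator

# only one operator needs to be "in memory" per iteration
fk_table = merge(evolve_slice(grid[q2], ekos[q2]) for q2 in grid)
print(fk_table)  # 1.5*2 + 2.5*1 + 0.5*4 = 7.5
```

The point of the design is exactly this loop: memory is bounded by one slice plus the running merge, with no callback machinery needed.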
I believe it's a very good solution. The spirit was actually to consume them one by one, and I just never thought that the grids were sliced as well.
For completeness: sliced EKOs will be the only EKOs from now on, so you would have to merge them anyhow to feed `Grid::evolve`. I would propose to even deprecate it, and eventually we could make it a wrapper around `Grid::evolve_slice` the moment the EKO Rust loader is available.
The only unpleasant bit is that we should still improve `Grid::merge` to support small-EKO convolution fully in memory from Python (without an intermediate grid write), because at the moment Python only has access to `merge_from_file`.
https://github.com/NNPDF/pineappl/blob/c756b4ca839804212c8432d9f2a5491e9d6e6202/pineappl_py/src/grid.rs#L614-L618
`Grid` is `Clone` now, so I don't think there's any showstopper for exposing `Grid::merge` to the Python interface.
> `Grid` is `Clone` now, so I don't think there's any showstopper for exposing `Grid::merge` to the Python interface.
Ah, great, I just didn't notice it happened. We can immediately add `Grid.merge` to the Python interface.
PR #244 will implement the necessary changes.
As required in https://github.com/NNPDF/pineappl/issues/185#issuecomment-1439662372 I'm opening a new issue for the problem.
The problem is not pressing at the moment, since we only have a memory problem with the EKO for the whole NNPDF4.0 data set taken altogether, but we always split by process. And the processes that require the biggest EKOs are jets, which need an EKO of ~4 GB.
However, at the moment PineAPPL needs to load the whole 4 GB (or however big they are). But this is only a problem in PineAPPL, since EKO can break them up and load one piece at a time (i.e. one scale at a time).
Actually, the format is already available. As soon as I have solved the current "top priority", I will step in and solve it myself, providing a lazy loader to reduce memory consumption (based on the current EKO loader and the new format).