NNPDF / pineko

PineAPPL + EKO ─➤ fast theories
https://pineko.readthedocs.io
GNU General Public License v3.0

Cache of operators #173

Open scarlehoff opened 5 months ago

scarlehoff commented 5 months ago

At the moment, in order to generate a theory, we need to generate an insane number of EKOs.

However, since many datasets share the same scales and the pineappl grids all share the same x grid, it should be possible to generate a cache of EKOs (for a given theory).

So for instance, if I'm going to run:

pineko theory ekos dataset_n 410000000

Pineko should be able to:

  1. Read all the EKOs already present in the eko folder (in which the EKO for dataset_n will be generated).
  2. Read the relevant operator cards (no need to parse all the EKOs).
  3. Find out which of the operators for dataset_n are already computed and take them directly from there.

The (ideal) next step would be not to save all operators, but just the union of the operators requested in all the operator cards.

I'm wondering whether this is a crazy idea or whether it could actually be doable. I'm particularly interested in the ideal next step, since I'm having storage problems...
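
A minimal sketch of how such a cache lookup could work (the operator-card layout with a "mu2grid" list of floats, the one-card-per-EKO convention, and the .tar extension are assumptions for illustration, not pineko's actual format):

```python
from pathlib import Path

import yaml


def find_reusable_operators(eko_dir: Path, target_card: Path) -> dict:
    """Map each evolution scale requested by `target_card` to an already
    existing EKO in `eko_dir` that contains it, if any.

    Illustrative only: assumes one operator card (YAML, with a "mu2grid"
    list of floats) next to each EKO tarball of the same name.
    """
    requested = set(yaml.safe_load(target_card.read_text())["mu2grid"])

    # Index the scales already covered by the EKOs present in the folder,
    # reading only the (small) operator cards, never the EKOs themselves.
    available = {}
    for card in eko_dir.glob("*.yaml"):
        for mu2 in yaml.safe_load(card.read_text())["mu2grid"]:
            available.setdefault(mu2, card.with_suffix(".tar"))

    # Scales that can be copied from an existing EKO instead of recomputed.
    return {mu2: available[mu2] for mu2 in requested if mu2 in available}
```

In this picture, steps 1 and 2 above reduce to a glob over the folder plus parsing the small operator cards, and step 3 to the final dictionary lookup.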

alecandido commented 5 months ago

We considered this many times, and it was part of the initial concept behind EKO.

The only reason why we gave up is the size of the EKOs. If this is no longer an issue and we have a lot of storage, it could be an interesting option (storing EKOs is essentially a space-time tradeoff, if you are actually reusing them).

The FK tables case

We discussed for quite some time compressing all the FK-table EKOs into a single one. Indeed, the terminology used was *small* EKO and *big* EKO for a theory: the small one is the postfit one (the one that is actually computed and available), and the big one would have been the union of all the EKOs used to generate the FK tables. However, since an EKO scales quadratically with the DIS grid (not exactly quadratically for the double-hadronic case), the storage requirement was considered absurd (just think about merging all the jets EKOs), and we gave up...

alecandido commented 5 months ago

More specifically, you could also reuse EKO subsets, or just recompute a subset, if some of the configurations match (the theory + evmod + scale variations + ...).

scarlehoff commented 5 months ago

The only reason why we gave up is the size of the EKOs

Why? I was thinking of (hoping for!) precisely a way of reducing the number of EKOs needed since, for instance, most DY datasets are probably sharing them.

felixhekhorn commented 5 months ago

At some point we also considered adding an interpolation in Q2 (or better, $\log(Q^2)$) ... (i.e. not computing for every possible $Q^2 \in \mathbb R$)
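
For concreteness, interpolating the operator entries in $\log(Q^2)$ between two precomputed scales might look like the sketch below (an illustration of the idea only, not an existing EKO feature; linear interpolation is an assumption):

```python
import numpy as np


def interpolate_operator(op_low, op_high, q2_low, q2_high, q2):
    """Linearly interpolate two evolution operators in log(Q^2).

    op_low, op_high: operators (numpy arrays of identical shape) computed
    at the scales q2_low and q2_high, with q2_low < q2 < q2_high.
    """
    t = (np.log(q2) - np.log(q2_low)) / (np.log(q2_high) - np.log(q2_low))
    return (1.0 - t) * op_low + t * op_high
```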

alecandido commented 5 months ago

Why? I was thinking of (hoping for!) precisely a way of reducing the number of EKOs needed since, for instance, most DY datasets are probably sharing them.

They usually involve different scales.

The relevant settings are controlled by the theory, so they should not change per dataset. And if we're computing the grids (when that happens) we can even control the momentum-fraction grid (ok, this is not the case for jets and the other Ploughshare grids). But the Q2 scales we do not control*...

*Well, there is a limit to which you can control the dynamical scale choice, but there are external recommendations on how to pick it, so it is not something we can lightheartedly use for computational optimization.

scarlehoff commented 5 months ago

They usually involve different scales.

Many of the DY datasets just have $\mu_F = \mu_R = m_Z$. That's why I'm thinking the same family of processes might often share the scales, even when they are dynamic.

(and even more so if we have the same process binned across some variable the scale does not depend on, like some of the 2D distributions)

alecandido commented 5 months ago

However, these datasets are often not problematic: if the scale does not depend on the bin, you often have a single scale per dataset, and those EKOs are small (the usual len(xgrid) ** 2 * len(flavors) ** 2 * size(float), without the Q2 factor). And the computational demand is proportional to the size.

All the big EKOs come from having many scales, to the best of my knowledge.
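
To put rough numbers on that estimate (the 50 x-grid points, 14 flavor entries in the evolution basis, and 8-byte floats are assumptions for illustration; errors are not counted):

```python
# Back-of-the-envelope size of a single-scale EKO.
n_x = 50    # x-grid points (assumed)
n_fl = 14   # flavor entries in the evolution basis (assumed)
n_q2 = 1    # a single scale, as for a fixed-scale dataset
size_mb = n_q2 * (n_x * n_fl) ** 2 * 8 / 1e6
print(f"~{size_mb:.1f} MB")  # ~3.9 MB per Q2 point
```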

cschwan commented 5 months ago

If I recall correctly, a big problem was the jet measurements at the LHC, where the xgrid also wasn't constant over Q2. That led to the biggest EKOs that I've seen so far. But we could try to implement a hybrid approach in which we have several EKOs:

  1. If the dataset uses our favourite xgrid of 50 points, we use a single 'big EKO'.
  2. For all the other datasets we don't change anything; we just keep using 'small' (lol) EKOs.
  3. In the meantime we should try to migrate datasets from 2 to 1, essentially implementing them in pinefarm ourselves.

So the best of both worlds essentially.
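
A hedged sketch of the routing between points 1 and 2 (the reference x-grid values and the way the dataset's x-grid is obtained are placeholders, not existing pineko code):

```python
import numpy as np

# Placeholder for the shared 50-point x-grid of natively produced grids.
REFERENCE_XGRID = np.geomspace(2e-7, 1.0, 50)


def choose_eko(dataset_xgrid, atol=1e-10):
    """Return "big" if the dataset can reuse the shared 'big EKO',
    "small" if it needs its own per-dataset EKO (cases 1 and 2 above).

    `dataset_xgrid` would be extracted from the pineappl grid; how to do
    that is left out of this sketch.
    """
    xgrid = np.asarray(dataset_xgrid, dtype=float)
    if xgrid.shape == REFERENCE_XGRID.shape and np.allclose(
        xgrid, REFERENCE_XGRID, atol=atol
    ):
        return "big"
    return "small"
```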

scarlehoff commented 5 months ago

If I recall correctly, a big problem was the jet measurements at the LHC, where the xgrid also wasn't constant over Q2.

This was due to them not being originally pineappl grids, right?

alecandido commented 5 months ago

If I recall correctly, a big problem was the jet measurements at the LHC, where the xgrid also wasn't constant over Q2.

I remember this as well. However, it should not be a big deal: each Q2 value is computed separately in EKO, so sharing the same xgrid across different scales only helps for the common part (up to the matching).

I.e. if each value of Q2 has its own xgrid, it could be up to 3x the computation (up-to-bottom evolution + bottom matching + from-bottom evolution, since the NNPDF Q0 is in 4 flavors, and ignoring anything above the top). But if that's not the case, the overhead should be small.
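
Schematically (notation only for illustration), the three pieces mentioned above correspond to factorizing the evolution across the bottom threshold:

$$
E(Q^2 \leftarrow Q_0^2) = E^{(n_f=5)}(Q^2 \leftarrow m_b^2)\, A^{(4\to 5)}(m_b^2)\, E^{(n_f=4)}(m_b^2 \leftarrow Q_0^2)
$$

If a target scale comes with its own xgrid, all three factors may have to be recomputed for it, hence the up-to-3x overhead; with a shared xgrid, the evolution up to $m_b^2$ and the matching are the common part that can be reused.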

This was due to them not being originally pineappl grids, right?

For sure.

cschwan commented 5 months ago

In principle and in practice we also have preferred Q2 values: if a dynamic scale is chosen, the Q2 points of newly generated grids should always be a subset of 40 fixed values. The only exception for new grids comes from datasets where we chose a static scale value, but then there's only one Q2 per dataset/bin.

We could choose not to apply the static-scale optimization, and then we already know which EKOs are needed: only the ones for the known 50 x-grid values and the 40 Q2 values.

alecandido commented 5 months ago

With one Q2 value per dataset plus the 40 fixed ones, and 50 xgrid points, even the "big one" per theory (the FK-table EKO, as opposed to the postfit EKO) would be a very reasonable EKO. It would certainly be sizeable, but reasonable.
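
(For a rough sense of scale, under the same assumptions as the per-scale estimate above, i.e. 50 x-grid points, 14 flavor entries, and 8-byte floats, the 40 shared Q2 values come to roughly 40 × 3.9 MB ≈ 160 MB, plus a few MB for each fixed-scale dataset.)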

However, we still have many wild (imported) grids. Are we planning to recompute them soon? Have they already been recomputed?

scarlehoff commented 5 months ago

For the old theories that ship has sailed of course, but we are recomputing many grids as we are preparing the theory for 4.1

I think only singletop, jets and dijets will not be native pineappl grids. And both jet and dijets should be pineappl-able since they are processes included in nnlojet.

cschwan commented 5 months ago

I think only singletop, jets and dijets will not be native pineappl grids. And both jet and dijets should be pineappl-able since they are processes included in nnlojet.

Using separate EKOs for those seems like a good compromise.

alecandido commented 5 months ago

For the old theories that ship has sailed of course, but we are recomputing many grids as we are preparing the theory for 4.1

Whatever you're doing, the proposal is of course for new theories. The old ones could at most be deprecated in favor of the new ones (because of known bugs/limitations), and the files could be dropped in the very long term.

I think only singletop, jets and dijets will not be native pineappl grids. And both jet and dijets should be pineappl-able since they are processes included in nnlojet.

How much computation would be required to pinefarm them?