Remove `operator_cards` (and _template.yaml)

scarlehoff commented 4 weeks ago

As the title says, I would like to remove the need for the operator_card (and _template.yaml) in pineko. The reasoning is below. I'd like some feedback (@felixhekhorn @giacomomagni) on whether 1) this is a good idea (maybe there's a reason we need those that I'm not considering) and 2) my ultimate goal is doable.

Reasoning:

Given a theory card, a grid and a _template.yaml, I can reproduce any operator_card.yaml. Therefore, the operator_card as a standalone object does not add any extra information. In addition, both the EKO and the FKTable contain a copy of the operator_card, so reproducibility is already ensured (the cards used not to be available in the FK). Therefore, I would like to remove the need for the opcard command; instead, the ekos command will generate the operator card and use it on the fly.
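To make the redundancy argument concrete, here is a minimal sketch of how an operator card could be rebuilt on the fly from the template plus information read from the grid. This is not pineko's actual code: `build_operator_card`, `grid_info` and its keys are hypothetical names, and which entries really come from the grid is an assumption made for illustration.

```python
import copy


def build_operator_card(template, grid_info):
    """Return a new operator card: template defaults plus grid-derived entries.

    `template` mimics the content of _template.yaml as a dict; `grid_info` is a
    hypothetical summary of what would be read from the grid.
    """
    card = copy.deepcopy(template)  # never mutate the shared template
    # Assumption: the (scale, nf) pairs of the operators come from the grid.
    card["mugrid"] = [[mu, nf] for mu, nf in grid_info["scales"]]
    # Assumption: these flags are properties of the grid, not of the template.
    card["configs"]["polarized"] = grid_info["polarized"]
    card["configs"]["time_like"] = grid_info["time_like"]
    return card


template = {
    "configs": {"n_integration_cores": 64, "polarized": False, "time_like": False},
    "mugrid": [],
}
grid_info = {"scales": [(50.0, 5), (91.2, 5)], "polarized": False, "time_like": False}
card = build_operator_card(template, grid_info)
```

Since the card is a pure function of its inputs, storing it as a separate file would add nothing that cannot be regenerated.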

_template.yaml:

Then, I would also like to remove the _template.yaml and either hard-code these options in pineko or add them as command-line arguments (here I need some feedback as well, @Radonirinaunimi @giacomomagni). For instance, in my experience the only thing I ever change in _template.yaml is the number of cores, and I usually then need to set it manually (and differently) for different datasets, which makes the autogeneration of the operator card a bit useless. There might be other options that are necessary, and those should be added (if they are not many) to the command.

Ultimate noble goal:

The reason I want to get rid of the operator cards is that I would like to generate an eko with a "fake" operator card to get a subset of ekos in one go. This can speed things up considerably because many datasets actually share the EKOs. My idea is that when one calls pineko with a list of datasets and runs pineko theory ekos, pineko will actually create one big eko with all the necessary operators, and then pineko theory fks will just look into that eko to see whether the necessary operators are available.

In a first step this will be done in a lazy way, so at the end of ekos the big eko will be separated into the small eko-per-grid so that the rest of the code and scripts work just the same. Of course this first implementation requires some care from the user (e.g., running all DY or TTB at once will be very effective, mixing jets and NMC will explode in your face) but later on we can even optimize that.
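A rough sketch of the bookkeeping behind this idea, assuming each grid can be reduced to the list of (scale, nf) pairs it needs. `merge_scales` and `split_scales` are hypothetical helpers, not existing pineko or eko functions, and the dataset names and scale values are made up for the example.

```python
def merge_scales(per_grid_scales):
    """Collect the union of all operators requested by any grid."""
    merged = set()
    for scales in per_grid_scales.values():
        merged.update(scales)
    return sorted(merged)


def split_scales(merged, per_grid_scales):
    """Assign to each grid its own subset, checking everything is available."""
    available = set(merged)
    result = {}
    for name, scales in per_grid_scales.items():
        missing = set(scales) - available
        if missing:
            raise KeyError(f"{name}: operators not in the big eko: {sorted(missing)}")
        result[name] = sorted(set(scales))
    return result


# Two hypothetical datasets sharing one operator.
per_grid = {
    "DY_A": [(50.0, 5), (91.2, 5)],
    "DY_B": [(91.2, 5), (120.0, 5)],
}
big = merge_scales(per_grid)         # computed once
small = split_scales(big, per_grid)  # the lazy eko-per-grid step
```

In this toy case three operators are computed instead of four. The saving grows with the overlap between datasets, and vanishes (while the big eko grows) when the datasets share nothing.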

This issue is obviously connected with both #173 and with #201

felixhekhorn commented 4 weeks ago

Mmm I'm not convinced of this

op card generation

The reason why there is an explicit operator card generation step is precisely to disentangle the grids and eko. You need the grids for the op card generation, but then no longer (see #14). I'd say giving up on that is a big blocker for a more flexible eko generation (because you need to always have GB of grid files everywhere). What would be possible (now) is to directly generate a "void" EKO, i.e. an EKO which has only recipes but no operators. This saves you one (yaml) file (which is then stored explicitly inside the EKO object).

give up on `_template.yaml`

This is explicitly reverting #47 #67 for which I'm sure there were good reasons. Passing the xgrid on a command line would be difficult

alternative

I'd suggest to push for #173 instead

scarlehoff commented 4 weeks ago

(because you need to always have GB of grid files everywhere)

Fine. I would still like to remove the need for the _template though...

This is explicitly reverting https://github.com/NNPDF/pineko/issues/47 https://github.com/NNPDF/pineko/pull/67 for which I'm sure there were good reasons. Passing the xgrid on a command line would be difficult

This is why I'm asking, because I'm not entirely sure there are good reasons anymore. I'm not sure about the xgrid but if I look at the top of the template:

configs:
  ev_op_max_order: 
    - 10
    - 0
  interpolation_polynomial_degree: 4
  interpolation_is_log: true
  scvar_method: None
  inversion_method: expanded
  n_integration_cores: 64
  polarized: false
  time_like: false
mugrid:
  - - 50.0
    - 5
 ...
debug:

polarized and time_like should not be part of the template, as they should be defined by the grid. For n_integration_cores I already mentioned the problem. All other flags I think could be fixed.
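As a sketch of what "fixing the flags and passing the cores on the command line" could look like: the parser below is a standalone illustration, not pineko's CLI, and the `--n-integration-cores` flag and its default of 1 are assumptions. The fixed values are copied from the template excerpt above.

```python
import argparse

# Options that would be hard-coded instead of read from _template.yaml.
FIXED_CONFIGS = {
    "ev_op_max_order": [10, 0],
    "interpolation_polynomial_degree": 4,
    "interpolation_is_log": True,
    "inversion_method": "expanded",
}


def parse_configs(argv):
    """Build the configs section from fixed defaults plus command-line input."""
    parser = argparse.ArgumentParser(prog="ekos-sketch")
    # Hypothetical flag: the only option that changes from run to run.
    parser.add_argument("--n-integration-cores", type=int, default=1)
    args = parser.parse_args(argv)
    return {**FIXED_CONFIGS, "n_integration_cores": args.n_integration_cores}


configs = parse_configs(["--n-integration-cores", "8"])
```

This way the number of cores can differ per dataset without editing any file.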

Then the xgrid... should that not be defined by the grid? I thought we were defining it with the grid...

Edit: I want to go for #173, but I would like to remove as many intermediate steps as possible.

felixhekhorn commented 4 weeks ago

Then the xgrid... should that not be defined by the grid? I thought we were defining it with the grid...

Nope: xgrid (formerly known as interpolation_xgrid, e.g. in #47) defines the grid on which the operator is computed and, starting from v0.15, the grid with which it is stored on disk. Afterwards it can be adjusted to whatever the user needs, e.g. pineko needs to match the xgrid of a given partonic grid, but this can be done (and was always done) a posteriori. The xgrid defines the precision (and so the computing time).

This is why I'm asking, because I'm not entirely sure there are good reasons anymore.

That I don't know of course - I vaguely remember that the low precision theories were the reason

scarlehoff commented 4 weeks ago

I vaguely remember that the low precision theories were the reason

In fact I thought we could no longer do low precision theories XD

giacomomagni commented 4 weeks ago

I'm happy to start dropping the template_card in favour of template_grid; as you say, all the other information is repeated. We can indeed pass the number of cores from the command line. Also, the command to generate operator cards (pineko theory opcards) could be merged with the eko command (pineko theory ekos), no? (We already do a similar trick in the fonll CLI.)

scarlehoff commented 4 weeks ago

Not if as @felixhekhorn says we want to keep the possibility of doing the ekos without the grids being available (I guess this can't be done in the case of nfonll?)

giacomomagni commented 4 weeks ago

Not if as @felixhekhorn says we want to keep the possibility of doing the ekos without the grids being available (I guess this can't be done in the case of nfonll?)

Wait I was referring to an interpolation grid, not to the real pineappl grid...

scarlehoff commented 4 weeks ago

Apologies, I was responding to

Also, the command to generate operator cards (pineko theory opcards) could be merged with the eko command (pineko theory ekos), no? (We already do a similar trick in the fonll CLI.)

Wrt the _template.yaml, I'm happy that you confirm the information is redundant. I'll remove it after the alpha_s business is done.
