That's a good idea. However, it only works when 1) the charm PDF is equal to the anti-charm PDF and, at the same time, 2) the bottom- and top-quark PDFs are zero at the FK table scale. Since this isn't the case in general, but is the case for NNPDF, the place where we can do this is in `Grid::convolute_eko`; let's add @AleCandido and @felixhekhorn to hear what they think.
In practice I'd give `convolute_eko` a few more parameters that allow the specification of these constraints. The metadata of the FK tables should also reflect the constraints, and ideally allow PineAPPL to check the PDF that is used, to prevent the user from using sets that don't fulfill them. We should probably discuss more optimization strategies that in general would allow us to speed up `convolute_eko` and make the FK tables smaller; I'm changing the title of this Issue to reflect that.
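To spell out why both conditions are needed, here is a quick sketch (not from the original discussion) using the standard evolution-basis combinations $q^\pm = q \pm \bar{q}$:

$$
\Sigma = \sum_q q^+, \qquad
T_{24} = u^+ + d^+ + s^+ + c^+ - 4b^+, \qquad
T_{35} = u^+ + d^+ + s^+ + c^+ + b^+ - 5t^+,
$$

$$
V = \sum_q q^-, \qquad
V_{15} = u^- + d^- + s^- - 3c^-, \qquad
V_{24} = u^- + d^- + s^- + c^- - 4b^-, \qquad
V_{35} = u^- + d^- + s^- + c^- + b^- - 5t^-.
$$

Condition 2) sets $b^\pm = t^\pm = 0$, which already gives $\Sigma = T_{24} = T_{35}$ and $V = V_{24} = V_{35}$; condition 1) additionally sets $c^- = 0$, which is what makes $V = V_{15}$. Note that $T_{15}$ is unaffected, since $c^+$ is non-zero in general.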
> 1) the charm PDF is equal to the anti-charm PDF

@felixhekhorn had the same worry. I'd say for now we can accept that to be true.

> 2) the bottom- and top-quark PDFs are zero at the FK table scale

But this is always true (or always false) for a given FK table, right?
Personally I was thinking of `reduced_table()` and `reduced_lumi()` methods (or something like that), where the simplification is done by the user (i.e., me) if they (i.e., I) need a smaller FK table. It doesn't really matter to me if the FK tables are unnecessarily big on disk or if I need to spend a few more seconds to load them*, but I want them to be as small as possible in memory for the fit.

*of course, I'm even happier if their size is also reduced on disk, but that's a second-order problem for me
But as you said this is an NNPDF thing, so let's do as @scarlehoff and I said in person: let's change the `.tables()` ... (that is, at the latest point possible ...)
I agree on not adding any further complication to `convolute_eko`, and on adding a method to the `FkTable` object. However, instead of providing methods for a different representation, I would add a method to get a new `FkTable` object by collapsing some rows of the original one. And then you can even dump the new object, already collapsed.
> We should probably discuss more optimization strategies that in general would allow us to speed up `convolute_eko` and make the FK tables smaller; I'm changing the title of this Issue to reflect that.
At some point definitely yes; however, let's focus for the moment on the things we need to reach our deadline :upside_down_face:
I would say that the idea of returning a new collapsed object serves well any possible usage:

* if you **don't** care about disk representation and initial effort, you can load whatever you have there, call the method, and get the new collapsed `FkTable` in memory
* if you **do** care, you do the operation offline, and you dump the result
That's a good idea! We could call it `FkTable::optimize(&self, constraints: ...) -> FkTable`.
I agree, at the end of the day an `FkTable` is conceptually different from a `Grid`, so there is no clash.
> That's a good idea! We could call it `FkTable::optimize(constraints: ...) -> FkTable`.

Mmm, but this clashes with our idea of an `FkTable` being immutable ...
> Mmm, but this clashes with our idea of an `FkTable` being immutable ...
Not really, because it returns another `FkTable` and doesn't modify `self`.
But will this work if the thing is not clonable? (And we want it to stay non-clonable.)
That's a good question...
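If it helps, here is a minimal sketch (hypothetical types, not the actual PineAPPL code) showing that a consuming method works without `Clone`: taking `self` by value moves the table, so nothing ever needs to be cloned.

```rust
// Hypothetical types, not the actual PineAPPL API: `FkTable` is deliberately
// not `Clone`.
struct FkTable {
    entries: Vec<f64>,
}

enum FkAssumptions {
    Nf4Sym,
}

impl FkTable {
    // Taking `self` by value moves the table into the method; the caller
    // gives up the original and receives the optimized table back, so no
    // clone is ever required.
    fn optimize(self, _assumptions: FkAssumptions) -> FkTable {
        // ... the actual row-collapsing would happen here ...
        self
    }
}

fn main() {
    let table = FkTable { entries: vec![1.0, 2.0] };
    let optimized = table.optimize(FkAssumptions::Nf4Sym);
    // `table` has been moved; only `optimized` is usable from here on.
    println!("{} entries", optimized.entries.len());
}
```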
@AleCandido @felixhekhorn Important question: are the grids (as opposed to the grids convoluted with PDFs) for sigma, T24, T35, and for V, V15, V24, V35 the same? In that case we can easily implement this optimization at the level of redefining the luminosity functions.
No (if I understand correctly), because some processes, e.g. CC DIS or CC DY, can resolve the difference between b and bbar, and hence V24 and T24 are unique.
@cschwan is this available? I.e., can I do `.optimize()` to get the following?

`sigma = T24 = T35`
`V = V15 = V24 = V35`

Or maybe this should be done by pineko (since it depends on the final scale)? @felixhekhorn
@scarlehoff The short answer is no. How important is that for the fit in terms of a timescale?
I'm a bit hesitant to implement the optimization because I'm still wondering what the proper way to do it would be. Isn't this technically a different basis?
In any case I'd like to make the assumptions explicit somewhere, so that the user can't use a PDF where these assumptions are not fulfilled.
> @scarlehoff The short answer is no. How important is that for the fit in terms of the timescale?
When we do the fit with all the tables, the memory is going to grow quite a bit and thus the fits will take longer. It's not a disaster, but it's a big QoL feature.
> Isn't this technically a different basis?
Yes and no. It's just an optimization over the evolution basis at a given scale.
> In any case I'd like to make the assumptions explicit somewhere, so that the user can't use a PDF where these assumptions are not fulfilled.
One possibility is that this is done at loading time, so that one can ask for the FK table or the optimized FK table where some of the values are summed (for instance, T24 and T35 get absorbed into the singlet). Saving them in an already-optimized format makes the storage side easier, but I don't care that much about that part.
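As an illustration of that summing, a minimal sketch (my own, not PineAPPL's internals) that absorbs the rows of channels known to be equal into a target row and drops them afterwards:

```rust
// Hypothetical sketch: an FK table as a dense matrix of shape
// (luminosity channels) x (x-grid points). `merges` maps a source row
// into the target row it should be absorbed into, e.g. T24 and T35
// into the singlet row.
fn collapse_rows(mut table: Vec<Vec<f64>>, merges: &[(usize, usize)]) -> Vec<Vec<f64>> {
    for &(source, target) in merges {
        // Empty the source row and add it element-wise onto the target row.
        let row = std::mem::take(&mut table[source]);
        for (t, s) in table[target].iter_mut().zip(row) {
            *t += s;
        }
    }
    // Drop the now-empty source rows.
    table.retain(|row| !row.is_empty());
    table
}

fn main() {
    // Three "channels" with two x-points each; absorb rows 1 and 2 into row 0.
    let table = vec![vec![1.0, 2.0], vec![3.0, 4.0], vec![5.0, 6.0]];
    let collapsed = collapse_rows(table, &[(1, 0), (2, 0)]);
    assert_eq!(collapsed, vec![vec![9.0, 12.0]]);
}
```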
> When we do the fit with all the tables, the memory is going to grow quite a bit and thus the fits will take longer. It's not a disaster, but it's a big QoL feature.
I see, that's why I've added this to the v0.5.0 milestone.
Would the interface to the fit understand if I removed T24, T35, V15, V24 and V35 from the list of luminosities, or should I keep them in there (their contribution would be zero)?
@felixhekhorn @AleCandido I'd suggest adding one parameter to `FkTable::optimize`, because somewhere higher up the theory must tell us which assumptions we need (for instance, whether we distinguish between charm and anti-charm as mentioned in the PC, or not).
> Would the interface to the fit understand if I removed T24, T35, V15, V24 and V35 from the list of luminosities, or should I keep them in there (their contribution would be zero)?
Since this is an optimization feature I guess all 0s should be removed for maximum optimization.
In any case, we don't want this to be a breaking change: we want the fit to work without the 0s, but it would be quite bad if it stopped working with them present.

Thus, I propose to release v0.5.0 and start a v0.5.1 milestone.
> @felixhekhorn @AleCandido I'd suggest adding one parameter to `FkTable::optimize`, because somewhere higher up the theory must tell us which assumptions we need (for instance, whether we distinguish between charm and anti-charm as mentioned in the PC, or not).
Since `FkTable::optimize` doesn't exist yet, contrary to what I said above, we can add `FkTable::optimize` accepting one parameter that specifies the assumptions. This wouldn't be a breaking change, and then we can add it in v0.5.1, as @AleCandido said.
Commit cf02e33dc65c53343c0276fd411094c6fe2bc95e should implement all interface-visible changes to do exactly what @scarlehoff proposed.
As promised, there's one new method `FkTable::optimize`, which requires an enum `FkAssumptions`; please have a look at its documentation and let me know if I can improve it. It should enable a general version of the optimization that @scarlehoff proposed in the first comment, if we want to fit bottom = anti-bottom, for instance. The optimization should work for FK tables in both the flavour and the evolution basis.
What we need next is 1) the corresponding method in the Python interface and 2) probably a slightly modified theory that specifies the assumption and that calls this method in Pineko/fktable. @AleCandido @felixhekhorn could you please have a look at this?
What's still missing is the actual optimization; for the time being only the luminosity function is rewritten, and therefore one gets repeated entries. I'm working on that now.
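To make the intended effect concrete, here is a hedged sketch of the channel identifications such an assumption implies. The evolution-basis PID values follow the pattern quoted later in this thread (`115` = T15, `208` = V8), but the exact encoding and the variant names should be treated as my assumptions, not the crate's API:

```rust
// Hedged sketch, not the actual crate code. PIDs follow the evolution-basis
// pattern quoted later in the thread (115 = T15, 208 = V8), so 100 = Sigma,
// 124 = T24, 135 = T35, 200 = V, and so on; treat the exact values and the
// variant names as assumptions.
#[derive(Clone, Copy)]
enum FkAssumptions {
    /// No assumptions: nothing can be merged.
    Nf6Ind,
    /// b = bbar = t = tbar = 0 and c = cbar at the FK-table scale.
    Nf4Sym,
}

/// Pairs of (source, target) channels: each source is summed into its target.
fn merged_channels(assumptions: FkAssumptions) -> &'static [(i32, i32)] {
    match assumptions {
        FkAssumptions::Nf6Ind => &[],
        // sigma = T24 = T35 and V = V15 = V24 = V35:
        FkAssumptions::Nf4Sym => &[(124, 100), (135, 100), (215, 200), (224, 200), (235, 200)],
    }
}

fn main() {
    assert!(merged_channels(FkAssumptions::Nf6Ind).is_empty());
    for (source, target) in merged_channels(FkAssumptions::Nf4Sym) {
        println!("absorb channel {source} into channel {target}");
    }
}
```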
I'll update the Python interface immediately; I guess there are only two things to bind:

* `FkTable.optimize()`
* the enum `FkAssumptions`, which is needed as an input

It should not be hard.
In 834a041397842c16cedac326e5012632a0e81dab I added the bindings for the method.
I had to make `FkAssumptions` `Copy` (it's just a plain C-like enum, so it should be fine). In order to be `Copy`, it had to be `Clone` (according to `cargo`, and so I guess `rustc`).
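That's correct by construction: in the standard library `Copy` is declared as `pub trait Copy: Clone`, so deriving `Copy` without `Clone` is a compile error. For a C-like enum both derives are trivial (variant names here are just illustrative):

```rust
// `Copy: Clone` is a supertrait bound, so `Clone` must come along with `Copy`.
#[derive(Clone, Copy)]
enum FkAssumptions {
    Nf6Ind,
    Nf4Sym,
}

fn main() {
    let a = FkAssumptions::Nf4Sym;
    let b = a; // copied, not moved: `a` remains usable afterwards
    let _ = (a, b, FkAssumptions::Nf6Ind);
}
```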
At this point it would have been worth a dedicated PR ;-)
In c63b0ea0b2355b84b8d16b8d767e34513bb830c1 it actually becomes possible to generate `FkAssumptions`, choosing them by string.
Since this mapping is encoded twice:

https://github.com/N3PDF/pineappl/blob/c63b0ea0b2355b84b8d16b8d767e34513bb830c1/pineappl/src/fk_table.rs#L323-L331
https://github.com/N3PDF/pineappl/blob/c63b0ea0b2355b84b8d16b8d767e34513bb830c1/pineappl_py/src/fk_table.rs#L34-L42

I would suggest splitting it out into a `HashMap` and reusing it.
@AleCandido I agree, I'll implement the traits `Display` and `TryFrom` (?) for `FkAssumptions`, for which I'll probably also make the strings look exactly like the enum values. We should probably hold off feeding strings into the Python interface until this is final.
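A sketch of how that could look (illustrative, not the actual PineAPPL code): a single table drives both `Display` and `TryFrom<&str>`, so the string mapping lives in exactly one place.

```rust
use std::fmt;

#[derive(Clone, Copy, Debug, PartialEq)]
enum FkAssumptions {
    Nf6Ind,
    Nf4Sym,
}

// Single source of truth for the string <-> variant mapping; the strings
// look exactly like the enum values, as proposed above.
const NAMES: &[(FkAssumptions, &str)] = &[
    (FkAssumptions::Nf6Ind, "Nf6Ind"),
    (FkAssumptions::Nf4Sym, "Nf4Sym"),
];

impl fmt::Display for FkAssumptions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Every variant is in NAMES, so the unwrap cannot fail.
        let name = NAMES.iter().find(|(v, _)| v == self).map(|(_, n)| *n).unwrap();
        f.write_str(name)
    }
}

impl TryFrom<&str> for FkAssumptions {
    type Error = String;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        NAMES
            .iter()
            .find(|(_, n)| *n == s)
            .map(|(v, _)| *v)
            .ok_or_else(|| format!("unknown FkAssumptions: {s}"))
    }
}

fn main() {
    let a = FkAssumptions::try_from("Nf4Sym").unwrap();
    assert_eq!(a.to_string(), "Nf4Sym");
}
```

A `const` slice is used instead of a `HashMap` here only because the set is tiny and a `const` avoids any initialization; a `HashMap` as @AleCandido suggested works just as well.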
Ok. At the moment it is what it is; if you wish to remove it, you can, but since it's not part of a release I believe we can keep it and use creation from Python just to experiment. The moment you have a better candidate, I'll make sure to update.
Implemented in commit 4427e9ed22c50cb4b8eacbc351c60aee7c01320f.
Commit 19aa0a5004a906ab5a6ad1a10a8a1fded378d640 should implement everything we need, and adds the option `--fk-table=Nf4Sym`, for instance, to `pineappl optimize`. This will allow us to optimize the already-generated FK tables. I still have to test everything, though.
Well done :D
Is there anything left that is needed (besides testing) @scarlehoff?
To first approximation only testing.
First tests look very promising:

* I ran `mkdir ../appl_subfktables_opt && for i in *.pineappl.lz4; do pineappl optimize --fk-table=Nf4Sym $i ../appl_subfktables_opt/$i; done` in `/media/FK/fktables/data/appl_subgrids` on dom to optimize the tables - that's surprisingly quick
* comparing the sizes of `appl_subfktables` and `appl_subfktables_opt` is quite nice: 1.3 GB vs. 643 MB, that's a bit less than half the size
* I compared the predictions with `for i in *.pineappl.lz4; do pineappl --silence-lhapdf diff --ignore-lumis $i ../appl_subfktables_opt/$i NNPDF40_nnlo_as_01180; done`
Finally, it seems to do what it promises: `pineappl obl --lumis appl_subfktables_opt/200-ATLASDY2D8TEV-aMCfast_obs_0.pineappl.lz4` doesn't show evolution IDs higher than `208` (V8) and `115` (T15). Prior to optimization this grid has 91 luminosities; after optimization there are only 45, which is plausible if you think about the size reduction.
Applying `pineappl optimize` to the PineAPPL grids seems to be non-modifying, as expected.
As far as PineAPPL is concerned this Issue can be closed, if you agree. Don't forget to call `FkTable::optimize` in pineko.
In order to use this in `pineko`, we'll need a new release of `pineappl`.
If you don't have anything more to add or close in the near future, @cschwan, I would just release `0.5.1`.
I have one question: `optimize` only 1) removes flavours that are 0 and 2) reduces flavours by summing them into the same "histogram bin" (depending on the theory), but the `xgrid` remains the same for all FK tables, right @cschwan?
That's correct, @scarlehoff!
For instance, at the initial scale we have that

`sigma = T24 = T35`
`V = V15 = V24 = V35`

One thing that the NNPDF FK tables do is to say "ok, since these are equal I'll just sum everything into, say, V, and empty the other rows", so one gets "V + V15 + V24 + V35" but the luminosity function is just V.