NNPDF / pineappl

PineAPPL is not an extension of APPLgrid
https://nnpdf.github.io/pineappl/
GNU General Public License v3.0

Fix `Grid::convolute` caching strategy #66

Closed by cschwan 3 years ago

cschwan commented 3 years ago

Right now Grid::convolute always has two_caches = true (I've discovered this while running the code in rust-gdb),

https://github.com/N3PDF/pineappl/blob/6ccb9a49fa9d3baa06f9958fd1db6f9286983449/pineappl/src/grid.rs#L343

which misses a few optimizations whenever the two initial-state hadrons are the same.
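To illustrate the point, here is a minimal sketch of why a single shared cache helps when both hadrons are the same. This is not PineAPPL's actual implementation: `PdfCache`, `toy_xfx` and `toy_convolution` are hypothetical names made up for this example, and the toy PDF has no physical meaning. Only the `two_caches` flag mirrors the variable mentioned above.

```rust
use std::collections::HashMap;

/// Cache key: parton id plus the bit patterns of x and Q^2 (f64 is not `Hash`).
type Key = (i32, u64, u64);

/// Hypothetical memoizing wrapper around a PDF callback `xfx(id, x, q2)`.
struct PdfCache<F: FnMut(i32, f64, f64) -> f64> {
    xfx: F,
    values: HashMap<Key, f64>,
    calls: usize, // number of real (uncached) PDF evaluations
}

impl<F: FnMut(i32, f64, f64) -> f64> PdfCache<F> {
    fn new(xfx: F) -> Self {
        Self { xfx, values: HashMap::new(), calls: 0 }
    }

    fn get(&mut self, id: i32, x: f64, q2: f64) -> f64 {
        let key = (id, x.to_bits(), q2.to_bits());
        if let Some(&value) = self.values.get(&key) {
            return value;
        }
        self.calls += 1;
        let value = (self.xfx)(id, x, q2);
        self.values.insert(key, value);
        value
    }
}

/// Toy stand-in for an expensive PDF call; assumption for illustration only.
fn toy_xfx(id: i32, x: f64, q2: f64) -> f64 {
    (1.0 - x).powi(3) * x.powf(-0.1) * q2.ln() / (1.0 + f64::from(id.abs()))
}

/// Sums f_a(x1) * f_b(x2) over a small toy grid for identical hadrons and
/// returns the sum together with the number of real PDF evaluations.
fn toy_convolution(two_caches: bool) -> (f64, usize) {
    let xs = [0.1, 0.2, 0.3];
    let ids = [-1, 1, 21];
    let q2 = 100.0;
    let mut cache1 = PdfCache::new(toy_xfx);
    let mut cache2 = PdfCache::new(toy_xfx);
    let mut sum = 0.0;

    for &a in &ids {
        for &b in &ids {
            for &x1 in &xs {
                for &x2 in &xs {
                    let f1 = cache1.get(a, x1, q2);
                    // With identical hadrons the second lookup can reuse the first cache.
                    let f2 = if two_caches {
                        cache2.get(b, x2, q2)
                    } else {
                        cache1.get(b, x2, q2)
                    };
                    sum += f1 * f2;
                }
            }
        }
    }

    (sum, cache1.calls + cache2.calls)
}

fn main() {
    let (sum_two, evals_two) = toy_convolution(true);
    let (sum_one, evals_one) = toy_convolution(false);
    println!("two caches: sum = {sum_two}, PDF evaluations = {evals_two}");
    println!("one cache:  sum = {sum_one}, PDF evaluations = {evals_one}");
}
```

In this toy setup the result is the same either way, but the shared cache needs only half as many PDF evaluations, because every (id, x, Q^2) point is looked up for both hadrons.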

cschwan commented 3 years ago

Plan:

cschwan commented 3 years ago

Here is a benchmark, using the grids from https://github.com/NNPDF/pineapplgrids/commit/696024720e03b61ec864619df2f527c35637deb3 and running the following command on a computer with an Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz:

time for i in *.pineappl.lz4; do pineappl --silence-lhapdf convolute $i NNPDF40_nnlo_as_01180; done > output

  1. Using pineappl from commit 69110553863d7dbf385e04d5b9f85c8936b17d1f takes 13 seconds. This version suffers from the problem that each initial state has its own cache, so the caching is imperfect. However, due to symmetrization it should be almost perfect in practice.
  2. Using pineappl from commit c6c5a0bb6f45480b3c4213fcbd84b73b04ba228d, with convolute2 instead of convolute in the helper function, the same command takes 73 seconds; this is because convolute2 doesn't use caching at all at this stage.

cschwan commented 3 years ago

A similar benchmark with pdf_uncertainty instead of convolute takes 1m37s in the first case and 4m9s in the second case; this subcommand uses 8 cores. This just goes to show that PDF caching is still very much needed.

cschwan commented 3 years ago

More work done in commits f2d2e0e4ee82316bb77f99c90aa740a3c31eb714, 5b81a3c9c69ff1597d8725751517e619990e6d16, 66b38fc6aa74a731ca954e8e43940b34fc15be6e and 4c931effc276ccdad99600678b0538d9a74a6bfc.

cschwan commented 3 years ago

Commit aa3c4b7ab88122c8f1aeb5e77912862369c81b09 implements the actual caching; with it, convolute and convolute2 are basically equally fast.

cschwan commented 3 years ago

The new cache is now activated for the CLI in commit cedd3485d27f66b32dc8963b40ab212f048eb401.

cschwan commented 3 years ago

The only remaining item is to convert Grid::convolute_subgrid.

cschwan commented 3 years ago

Commit 4dbedc63826d7709f833829fe834bbdd74dc7dda adds the remaining changes.