NNPDF / pineappl

PineAPPL is not an extension of APPLgrid
https://nnpdf.github.io/pineappl/
GNU General Public License v3.0

Use linear algebra routine for `Grid::convolute_eko` #152

Closed by cschwan 1 year ago

cschwan commented 2 years ago

`Grid::convolute_eko` often takes quite a long time. We should speed it up by using a linear algebra library and making the necessary rewrites (see also https://github.com/N3PDF/pineappl/pull/103#issue-1108216857).
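For illustration, here is a minimal sketch of the kind of rewrite this could mean. It assumes that, for one fixed flavor combination and scale, the evolution contracts a two-dimensional subgrid with one operator matrix per initial-state hadron. The function names `evolve_naive` and `evolve_matmul` and the toy matrices are hypothetical and are not part of PineAPPL's API. Written as explicit loops, the contraction costs O(n^4) for n x points, while the equivalent two matrix products cost O(n^3), and each product could be handed to an optimized linear algebra routine.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as nested lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def transpose(m):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*m)]


def evolve_naive(op1, grid, op2):
    """FK[a][b] = sum_{i,j} op1[a][i] * grid[i][j] * op2[b][j], as four loops."""
    fk = [[0.0] * len(op2) for _ in range(len(op1))]
    for a in range(len(op1)):
        for b in range(len(op2)):
            for i in range(len(grid)):
                for j in range(len(grid[0])):
                    fk[a][b] += op1[a][i] * grid[i][j] * op2[b][j]
    return fk


def evolve_matmul(op1, grid, op2):
    """The same contraction written as two matrix products: op1 @ grid @ op2^T."""
    return matmul(matmul(op1, grid), transpose(op2))


# Toy data (assumption): 2 x points in the grid, 3 x points after evolution.
op1 = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
op2 = [[2.0, 0.0], [1.0, 1.0], [0.0, 4.0]]
grid = [[1.0, 0.5], [0.25, 2.0]]

# Both formulations give the same result.
assert evolve_naive(op1, grid, op2) == evolve_matmul(op1, grid, op2)
```

The pure-Python `matmul` only stands in for whatever library routine would be chosen; the point of the sketch is the restructuring into matrix products, not the implementation of the product itself.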

alecandido commented 2 years ago

Actually, what do you mean by "linear algebra library"? Something like ndarray/nalgebra, or did you have something else in mind?

cschwan commented 2 years ago

Basically what I said in point 2 of https://github.com/N3PDF/pineappl/pull/103#issue-1108216857.

alecandido commented 2 years ago

The "fast linear algebra libraries" were generic even in that comment ^^

cschwan commented 2 years ago

I see, I didn't have any specific library in mind.

cschwan commented 1 year ago

@andreab1997 @AleCandido @felixhekhorn I would like to work on this issue in the coming weeks, and probably also on #122, which is related. To do that I need to know a few things:

  1. what's the dataset with the smallest EKO and how big is it?
  2. what's the dataset with the largest EKO and how big is it (I suppose it's dijet/single-inclusive jets)?
  3. which EKO should I use to optimize the evolution?

I don't need exact numbers; a rough estimate would be fine. I'd then try to generate an EKO, use the one from 1) to develop and test, and the one from 3) to optimize. The one from 2) should be an ideal candidate to test the final product and see how well it all scales. Let me know if you have more ideas!

felixhekhorn commented 1 year ago

> • what's the dataset with the smallest EKO and how big is it?

I guess that should be a positivity grid, since it has a single Q2 and also only a small number of bins (the number of bins is irrelevant for the EKO, but not for `convolute_eko`). The single example I have is 400 kB; maybe @andreab1997 can confirm?

> • what's the dataset with the largest EKO and how big is it (I suppose it's dijet/single-inclusive jets)?

Yes, I guess jets are the biggest, since they involve a huge number of Q2 values. There, the EKO for a single bin takes 1.8-3.1 MB, and CMS_1JET_8TEV has 239 bins. Remember that back then I also called `convolute_eko` bin by bin, and I'm not sure whether we can circumvent that.

> • which EKO should I use to optimize the evolution?

Dunno, that is a difficult question. In a sense, I guess all EKOs are the same, meaning they have the same mathematical structure. The content may make the numerics trickier (meaning cancellations), but this is something you cannot prevent, since it is physics. I think any DIS grid should be fine, say HERA CC. What do you think, @AleCandido @andreab1997?

Of course, the trivial overall statement is that the size of any EKO scales quadratically with the number of x points, so you can always shrink it or blow it up.
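As a rough back-of-the-envelope check of these numbers, the following sketch estimates operator sizes. The dense layout in `eko_size_bytes` (one square matrix over flavors and x points per Q2 value), the default of 14 flavors, and 8 bytes per entry are assumptions made for illustration only; the function name is hypothetical. The total for the jet dataset further assumes one operator of the quoted size per bin, as in the bin-by-bin approach mentioned above.

```python
def eko_size_bytes(n_q2, n_x, n_flavors=14, bytes_per_entry=8):
    """Size of a dense operator under the assumed layout:
    one (n_flavors * n_x) x (n_flavors * n_x) matrix per Q2 value."""
    return n_q2 * (n_flavors * n_x) ** 2 * bytes_per_entry


# Quadratic scaling: doubling the number of x points quadruples the size.
ratio = eko_size_bytes(1, 100) / eko_size_bytes(1, 50)
assert ratio == 4.0

# Total for the jet example: 239 bins at 1.8-3.1 MB per bin.
n_bins = 239
low_mb, high_mb = n_bins * 1.8, n_bins * 3.1
print(f"CMS_1JET_8TEV, all bins: roughly {low_mb:.0f}-{high_mb:.0f} MB")
```

Under these assumptions the jet dataset needs several hundred MB of operators in total, about three orders of magnitude more than the 400 kB positivity example.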

andreab1997 commented 1 year ago

> > • what's the dataset with the smallest EKO and how big is it?
>
> I guess that should be a positivity grid, since it has a single Q2 and also only a small number of bins (the number of bins is irrelevant for the EKO, but not for `convolute_eko`). The single example I have is 400 kB; maybe @andreab1997 can confirm?

Yes, I also believe that the positivity grid should be the smallest.

> > • what's the dataset with the largest EKO and how big is it (I suppose it's dijet/single-inclusive jets)?
>
> Yes, I guess jets are the biggest, since they involve a huge number of Q2 values. There, the EKO for a single bin takes 1.8-3.1 MB, and CMS_1JET_8TEV has 239 bins. Remember that back then I also called `convolute_eko` bin by bin, and I'm not sure whether we can circumvent that.

> > • which EKO should I use to optimize the evolution?
>
> Dunno, that is a difficult question. In a sense, I guess all EKOs are the same, meaning they have the same mathematical structure. The content may make the numerics trickier (meaning cancellations), but this is something you cannot prevent, since it is physics. I think any DIS grid should be fine, say HERA CC. What do you think, @AleCandido @andreab1997?

I usually use DIS grids to test, so I agree that HERA would be a sensible choice.

> Of course, the trivial overall statement is that the size of any EKO scales quadratically with the number of x points, so you can always shrink it or blow it up.

cschwan commented 1 year ago

OK great, then I'll start using the positivity grid.

alecandido commented 1 year ago

> > • what's the dataset with the smallest EKO and how big is it?
>
> I guess that should be a positivity grid, since it has a single Q2 and also only a small number of bins (the number of bins is irrelevant for the EKO, but not for `convolute_eko`). The single example I have is 400 kB; maybe @andreab1997 can confirm?

Another relevant small example (with DY kinematics) should be a total inclusive cross section, with a single bin (but I don't remember whether it has a dynamic scale or not).

E.g. something like ATLASTTBARTOT7TEV or CMSTTBARTOT7TEV, and similar ones.