NNPDF / pineappl

PineAPPL is not an extension of APPLgrid
https://nnpdf.github.io/pineappl/
GNU General Public License v3.0
12 stars 3 forks source link

Allow to easily generate order masks #73

Closed scarlehoff closed 2 years ago

scarlehoff commented 2 years ago

In order to mask orders in the python interface I need to pass five different booleans

    qcd_result = pine.convolute(xf, xf, a_s,  [True, True, False, False, False])

What is the combination for NLO QCD and NLO EW? (so, LO as and LO a) The closest combination for the he qcd_result I get is [True, False, False, False, True] but would prefer to have the exact definition.

(wasn't sure whether this is a python or internal pineapl thing, so wasn't sure whether to direct the question to @cschwan or @felixhekhorn @AleCandido)

cschwan commented 2 years ago

You need to match the ordering returned by grid.orders(). This returns a list of PyOrder objects, which you can query with as_tuple(), which in turn returns the exponents. For Drell-Yan you should get the following tuples:

(0, 2, 0, 0)
(1, 2, 0, 0)
(1, 2, 1, 0)
(1, 2, 0, 1)
(0, 3, 0, 0)
(0, 3, 1, 0)
(0, 3, 0, 1)

The ordering can be different, and some tuples may be missing. The first tuple is the alpha^2 contribution (that's the LO), the next three are the NLO QCD (one power in alphas^1), and the last three are the NLO EW (alpha^3). If you don't need scale variations you can ignore (mask out) all orders that have non-zero power in logxir or logxif.

To calculate NLO QCD, you would therefore have to construct the following mask (no scale variations):

[True, True, False, False, False, False, False]

NLO EW would be

[True, False, False, False, True, False, False]

Note again that the ordering of the tuples given above are arbitrary, and that there can be many NLO contributions depending on the process, jets for instance can have four different NLO (and three LO), each with a renormalization and factorization scale variation grid, yielding in total 12 NLO grids (and three LO grids).

scarlehoff commented 2 years ago

I guess for different DY I would get different orders as well? For instance, for one of the DY_13TeV I get:

[orde.as_tuple() for orde in pine.orders()]

[(0, 2, 0, 0),
 (0, 3, 0, 0),
 (0, 3, 1, 0),
 (0, 3, 0, 1),
 (1, 2, 0, 0),
 (1, 2, 1, 0),
 (1, 2, 0, 1)]

So I guess in this case I need indeed [True, False, False, False, True] for QCD and [True, True, ... False...] for EW. Ok. If this kind of agreement (NLO QCD - NLO QCD) is the one expected between NNPDF fktables and pineappl (It's better than 1% sometimes even 1/1000 everywhere) then done :tada:

60728.104880 | 60522.719962
81917.298892 | 81734.695100
100762.631824 | 100868.942643
116706.650959 | 117000.740717
129694.870399 | 129659.852907
138809.684960 | 139312.762054
144109.582766 | 144696.949294
143703.496201 | 143953.700108
134733.578628 | 134808.297653
117564.704629 | 117600.032637
97699.892159 | 98277.907191
77189.144055 | 77139.725131
57170.074805 | 57260.858459
38933.969678 | 38963.564297
17959.886482 | 17931.200391
cschwan commented 2 years ago
60728.104880 | 60522.719962
81917.298892 | 81734.695100
100762.631824 | 100868.942643
116706.650959 | 117000.740717
129694.870399 | 129659.852907
138809.684960 | 139312.762054
144109.582766 | 144696.949294
143703.496201 | 143953.700108
134733.578628 | 134808.297653
117564.704629 | 117600.032637
97699.892159 | 98277.907191
77189.144055 | 77139.725131
57170.074805 | 57260.858459
38933.969678 | 38963.564297
17959.886482 | 17931.200391

This looks pretty good. In general, we expect differences due to

alecandido commented 2 years ago

Notice that pineappl info --get runcard for the time being will be available only for the madgraph generated pineapplgrids. The one produced by yadism are lacking it (I will concatenate the two runcards and put there at some point).

alecandido commented 2 years ago

Since #74 is not going to do it maybe we would still like to implement:

cschwan commented 2 years ago

Since #74 is not going to do it maybe we would still like to implement:

* [ ]  a function to generate order masks from something similar to "NLO QCD" or "NLO EW"

I agree, I've changed the title to reflect this.

cschwan commented 2 years ago

Commit b824458e27c2e17a1dff29e39b116688e6bb890a adds two new functions, which allow the selection of NLO EW and QCD. I've named them Order::mask_highest_alphas and Order::mask_highest_alpha, which IMO describe a bit better what they do. If the grid contains all orders for the specific process at LO, NLO, NNLO, ... then mask_highest_alphas will return the mask for LO, NLO, NNLO, ... QCD, and the other function for EW.

@AleCandido @felixhekhorn @scarlehoff

scarlehoff commented 2 years ago

I'm afraid it is not super clear. In particular you said

mask_highest_alphas will return the mask for LO, NLO, NNLO

Do you mean it will return a mask for each of the orders or one containing LO + NLO + NNLO?

It might be silly but I'm confused about "mask_highest" returning "orders". Should I expect one mask or several? Or one mask with all orders? What do I have to give as the input?

The example didn't clarify it for me I'm afraid.

felixhekhorn commented 2 years ago
alecandido commented 2 years ago

Moreover: yesterday we said that might be useful to mask lower orders as well.

I would do instead a single function that accepts two ranges, first for QCD and second for QED, inside an option to make it None for unconstrained.

cschwan commented 2 years ago

Commit 49efe64bfc9f77f9461046f9d020934d5626f79f should implement most suggestions, please let me know what you think (the documentation might still be a bit unclear).

alecandido commented 2 years ago

I believe now is much better, and even the documentation is pretty good.

I would suggest to shift the orders:

alecandido commented 2 years ago

Honestly my original idea was something like this:

fn get_bounds<T: RangeBounds<u32>>(r: T) -> (u32, Option<u32>) {
    let start = match r.start_bound() {
        Bound::Unbounded => 0,
        Bound::Included(x) => *x,
        _ => unreachable!(),
    };

    let end = match r.end_bound() {
        Bound::Unbounded => None,
        Bound::Included(x) => Some(*x),
        _ => unreachable!(),
    };

    (start, end)
}

pub fn create_mask<T: RangeBounds<u32>>(orders: &[Order], as_range: T, al_range: T) -> Vec<bool> {
    let (min_as, max_as) = get_bounds(as_range);
    let (min_al, max_al) = get_bounds(al_range);

    ...
}

In order to be able to select even NLO QED only, e.g using:

create_mask(orders, .., 1..=1)

But maybe it's too fancy...


Just to elaborate: this works well with the simplified implementation of above, because you could filter by:

if let Some(max) == max_as { 
  min_as <= delta_as <= max
} else { 
  min_as <= delta_as
} && if let Some(max) == max_al { 
  min_al <= delta_al <= max
} else { 
  min_al <= delta_al
}
alecandido commented 2 years ago

Ok, stupid me: it was a complicate solution to an easy problem.

The actual function would be much easier with the same signature:

pub fn create_mask<T: RangeBounds<u32>>(orders: &[Order], as_range: T, al_range: T) -> Vec<bool> {
    ...
    .map(|Order { alphas, alpha, .. }| {
        let delta_as = alphas - alphas_lo;
        let delta_al = alpha - alpha_lo;

        as_range.contains(delta_as) && al_range.contains(delta_al)
    }
    ...
}
cschwan commented 2 years ago

@AleCandido (0, 1), (1, 0) and (1, 1) only produce the same output in the case when there is only a single LO, but they work differently in the case when there are multiple, see commit 04da81f7ee923ad963e521a7a3d995ea76616378. Therefore I can't remove (0, 0).

cschwan commented 2 years ago

Ok, stupid me: it was a complicate solution to an easy problem.

The actual function would be much easier with the same signature:

pub fn create_mask<T: RangeBounds<u32>>(orders: &[Order], as_range: T, al_range: T) -> Vec<bool> {
    ...
    .map(|Order { alphas, alpha, .. }| {
        let delta_as = alphas - alphas_lo;
        let delta_al = alpha - alpha_lo;

        as_range.contains(delta_as) && al_range.contains(delta_al)
    }
    ...
}

That's a different function that doesn't let you specify NNLO QCD for top-pair production though; you need to treat the LOs differently from the NLOs and the NNLOs to do that.

scarlehoff commented 2 years ago

Commit 49efe64 should implement most suggestions, please let me know what you think (the documentation might still be a bit unclear).

I think the documentation + example / test is clear enough now. Thanks!

The only thing confusing for me is that you have to pass an array of orders to the function (naively the array would be created automagically if you do grid_object.create_mask(a=2, as=1) but I guess the proper use is more like create_mask(grid_object.orders(), a=2, as=1) which is not bad either.

cschwan commented 2 years ago

The only thing confusing for me is that you have to pass an array of orders to the function (naively the array would be created automagically if you do grid_object.create_mask(a=2, as=1) but I guess the proper use is more like create_mask(grid_object.orders(), a=2, as=1) which is not bad either.

I chose it to be a class method of Order, because in that case I can use it even without a Grid instance, which is important for doctesting. And manually passing grid.orders() isn't too difficult :smile: .

alecandido commented 2 years ago

@AleCandido (0, 1), (1, 0) and (1, 1) only produce the same output in the case when there is only a single LO, but they work differently in the case when there are multiple, see commit 04da81f. Therefore I can't remove (0, 0).

Ok, I see. I didn't consider the case of different LOs.

That's a different function that doesn't let you specify NNLO QCD for top-pair production though; you need to treat the LOs differently from the NLOs and the NNLOs to do that.

Here I didn't properly understand, apart from the implementation the goal was just to be able to specify a minimum value as well, but I guess there is a counterexample I'm missing again...

alecandido commented 2 years ago

The only thing confusing for me is that you have to pass an array of orders to the function (naively the array would be created automagically if you do grid_object.create_mask(a=2, as=1) but I guess the proper use is more like create_mask(grid_object.orders(), a=2, as=1) which is not bad either.

For this I believe we could even add a tiny function in the python package: we would like the actual bindings (pyo3) to be as thin as possible and not to add anything that is not in the crate, but I believe that it's fine instead to add convenience wrappers in the very last python layer.

alecandido commented 2 years ago

Stupid observation: https://github.com/N3PDF/pineappl/blob/04da81f7ee923ad963e521a7a3d995ea76616378/pineappl/src/grid.rs#L165 wouldn't be more idiomatic to use Self instead of Order here?

cschwan commented 2 years ago

Stupid observation:

https://github.com/N3PDF/pineappl/blob/04da81f7ee923ad963e521a7a3d995ea76616378/pineappl/src/grid.rs#L165

wouldn't be more idiomatic to use Self instead of Order here?

No, that's a good point that also clippy complains about; fixed in commit 64bc7e268de40d519c61a419b45aa9eac91b7bab.

cschwan commented 2 years ago

@AleCandido we need a Python wrapper for this function, then we can close this Issue, I believe.

alecandido commented 2 years ago

Okay, I'll do immediately

alecandido commented 2 years ago

Ok, it has not been straightforward, but it's finally working, see b8d834919fb037dcf9d53a2e6270356395ce6331.

alecandido commented 2 years ago

The alternative would be to make it a static method of Order in python, see https://pyo3.rs/v0.15.1/class.html#static-methods.

What do you prefer?

P.S.: a static method is a method that does not get self, so is somewhat similar to an associated function (even though they are more similar to class methods, that in python are mostly used as alternative constructors)

cschwan commented 2 years ago

I'd prefer static method because that's how it's done also in Rust!

alecandido commented 2 years ago

Okay, I'll do it in a moment

alecandido commented 2 years ago

Done in 388536618a767a89917fa0c8b934c385b902ebda

felixhekhorn commented 2 years ago

@AleCandido could you please make this more consistent? i.e. bind the content inside Python

order_mask = pineappl.grid.PyOrder.create_mask([e._raw for e in pine.orders()], max_as=2, max_al=0)

... and I'm still in favour of having directly a method on grid ;-)

cschwan commented 2 years ago

This looks a bit strange indeed. Why does PyOrder::create_mask accept orders as Vec<PyRef<PyOrder>>? Is the PyRef really needed?

alecandido commented 2 years ago

I'm currently checking it, nevertheless:

P.S.: at the moment yadism runner in runcards is broken D:

felixhekhorn commented 2 years ago

bind the content inside Python

cschwan commented 2 years ago

There shouldn't be any use of Order, as far as the interfaces are concerned, even right now; they should always ask for PyOrder if I'm not mistaken.

alecandido commented 2 years ago

I added the conversion between Order and PyOrder in b4675aca3ed68c66f638fcb55b764431dee0b2e7

Note: the idea is PyOrder to be the internal object, while Order should be the one exposed to the user

alecandido commented 2 years ago

Do you still want even the bindings as a Grid method? (of course it costs me nothing, I just have to write the very same call you need replacing pine with self...)

cschwan commented 2 years ago

I added the conversion between Order and PyOrder in b4675ac

Note: the idea is PyOrder to be the internal object, while Order should be the one exposed to the user

Sorry, I'd confused Order in Rust with Order in Python.

cschwan commented 2 years ago

Do you still want even the bindings as a Grid method? (of course it costs me nothing, I just have to write the very same call you need replacing pine with self...)

In Rust I'd definitely don't want it, because I'd like to avoid having two different ways of solving a problem, especially if the number of lines doesn't differ between the two. In Python I don't know how to decide that.

felixhekhorn commented 2 years ago

In Rust I'd definitely don't want it, because I'd like to avoid having two different ways of solving a problem, especially if the number of lines doesn't differ between the two. In Python I don't know how to decide that.

I'm not happy but I can accept ;-)

PS: I would say there is no problem in adding syntactical sugar, as long as you use the same implementation - which of course we would ... PPS: keep in mind that with 80 characters by line "same number of lines" is a challenge ;-)

alecandido commented 2 years ago

In Rust I'd definitely don't want it, because I'd like to avoid having two different ways of solving a problem, especially if the number of lines doesn't differ between the two. In Python I don't know how to decide that.

I can tell you the pythonic way: the same, no repetition.

python -c "import this" | grep "way"
alecandido commented 2 years ago

Still, I can accept syntactic sugar. And it's not a problem from the point of view of bindings: we imposed that in the pyo3 part there should be nothing more than what is in main pineappl crate, but in the pure python interface for me is perfectly fine to do whatever we prefer.

cschwan commented 2 years ago

PPS: keep in mind that with 80 characters by line "same number of lines" is a challenge ;-)

To quote someone else answering to me basically asking the same question:

Ditch the 80 character limit, this isn't the 80s anymore.

:wink: