NNPDF / nnpdf

An open-source machine learning framework for global analyses of parton distributions.
https://docs.nnpdf.science/
GNU General Public License v3.0

datasets with asymmetric errors - multiplicative uncertainties don't match additive #903

Closed · wilsonmr closed this issue 3 years ago

wilsonmr commented 5 years ago

Commondata files are laid out as per https://github.com/NNPDF/nnpdf/blob/master/doc/data/data_layout.pdf

It seems to me that we should have

100*(columns_additive/central_values) == columns_multiplicative

Using https://github.com/NNPDF/nnpdf/pull/476 we can load the data as a pandas DataFrame and check this very easily. I find the following datasets to have some kind of conflict (a rough version of the check is sketched after the list):

DATA_ATLASTOPDIFF8TEVTPTNORM.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP_NORM.dat
DATA_ATLASTOPDIFF8TEVTTMNORM_SF.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_PT_NORM.dat
DATA_CMSJETS11_SF.dat
DATA_CDFR2KT_SF.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP_NORM.dat
DATA_CMSJETS11.dat
DATA_ATLAS1JET11.dat
DATA_LHCBZEE2FB.dat
DATA_ATLASTOPDIFF8TEVTTMNORM.dat
DATA_ATLASPHT15_SF.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP_NORM.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP.dat
DATA_CMSTTBARTOT.dat
DATA_D0R2CON.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_PT_NORM.dat
DATA_ATLASTOPDIFF8TEVTRAPNORM.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_PT.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_PT_NORM.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_PT.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_PT_NORM.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP.dat
DATA_D0ZRAP_SF.dat
DATA_ATLASPHT15.dat
DATA_D0ZRAP.dat
DATA_LHCBW36PB.dat
DATA_ATLASTOPDIFF8TEVTTRAPNORM_SF.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_PT.dat
DATA_ATLASPHT12.dat
DATA_CMSTTBARTOT_SF.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP_NORM.dat
DATA_CDFR2KT.dat
DATA_ATLASTOPDIFF8TEVTRAPNORM_SF.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP.dat
DATA_ATLAS1JET11_SF.dat
DATA_ATLASTOPDIFF8TEVTTRAPNORM.dat
DATA_CDFZRAP.dat
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_PT.dat
DATA_ATLASTOPDIFF8TEVTPTNORM_SF.dat
DATA_ATLASPHT12_SF.dat
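
Roughly, the check looks something like the sketch below. This is only a sketch: the column positions follow my reading of data_layout.pdf and may need adjusting, and the real loader is the one from NNPDF/nnpdf#476.

import glob

import numpy as np
import pandas as pd

def check_commondata(path, data_col=5, first_sys_col=7):
    """Return True if 100*add/central matches the mult columns."""
    # whitespace-separated table: one header line, then one row per data point
    df = pd.read_csv(path, sep=r"\s+", header=None, skiprows=1)
    central = df[data_col].to_numpy()
    add = df.iloc[:, first_sys_col::2].to_numpy()        # additive columns
    mult = df.iloc[:, first_sys_col + 1::2].to_numpy()   # multiplicative columns
    return np.allclose(100 * add / central[:, None], mult)

for datafile in sorted(glob.glob("commondata/DATA_*.dat")):
    if not check_commondata(datafile):
        print(f"mismatch: {datafile}")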

For some of these it might be a bookkeeping task because the uncertainties are additive. However, take for example D0ZRAP, which performs symmetrisation of errors; we see in

https://github.com/NNPDF/buildmaster/blob/12caaabafd824b6adc013eb205e121e25b0969cc/filters/D0.cc#L77

that the mult values are calculated with respect to the unshifted central values, which I believe is incorrect. Some of the above datasets have this issue (they can easily be identified by the _SF suffix). For at least one other I have found a bug in the buildmaster implementation:

https://github.com/NNPDF/buildmaster/blob/12caaabafd824b6adc013eb205e121e25b0969cc/filters/LHCBZEE2FB.cc#L125

should read:

fSys[i][l].add  = syscor[i][l-1];
fSys[i][l].mult = fSys[i][l].add/fData[i]*1e2;  // mult derived from the matching additive systematic

(it is calculating the multiplicative unc. for the wrong systematic - this can be seen by examining the commondata)

I'm happy to go through more of these sets but perhaps I could have some assistance?

wilsonmr commented 5 years ago

@voisey could you see what the problem could be with the datasets you implemented?

wilsonmr commented 5 years ago

Actually, I'm pretty sure most of these are the same issue as D0ZRAP, where the errors are symmetrised but the old central value is used to calculate the multiplicative error.

wilsonmr commented 5 years ago

OK, I would like to refine my list; I have grouped the datasets and removed duplicates with the _SF suffix:

singletop
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_PT_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_PT_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_PT
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_PT_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_PT
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_PT_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_PT
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP_NORM
DATA_ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP
DATA_ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_PT
SF sets
DATA_ATLASTOPDIFF8TEVTTMNORM_SF
DATA_CMSJETS11_SF
DATA_CDFR2KT_SF
DATA_ATLASPHT15_SF
DATA_D0ZRAP_SF
DATA_ATLASTOPDIFF8TEVTTRAPNORM_SF
DATA_CMSTTBARTOT_SF
DATA_ATLASTOPDIFF8TEVTRAPNORM_SF
DATA_ATLAS1JET11_SF
DATA_ATLASTOPDIFF8TEVTPTNORM_SF
DATA_ATLASPHT12_SF
other
DATA_LHCBZEE2FB
DATA_D0R2CON
DATA_LHCBW36PB
DATA_CDFZRAP

SINGLETOP and *_SF

I believe all of these fall under the same heading: the central value is shifted by symmetrising the errors, but this is done after calculating either the MULT or the DATA columns.

other

These each have more individual issues:

juanrojochacon commented 5 years ago

A quick comment on D0R2CON: this experiment was deprecated long ago, so it can safely be deleted, @wilsonmr, no need to worry about it. Thanks for the cross-checks with all these filters.

wilsonmr commented 5 years ago

thanks @juanrojochacon I'll bear that in mind

voisey commented 5 years ago

Thanks for this @wilsonmr. I'll have a look and get back to you

voisey commented 5 years ago

Hi @wilsonmr, is there some documentation on how to use the API? I would like to load the single top data in the same way that you did

wilsonmr commented 5 years ago

Hi @voisey, sorry my comments were not the clearest. I had a look at the single top datasets and the inconsistency between the multiplicative and additive columns in the commondata files comes down to the order in which the data point is shifted, e.g.:

        symmetriseErrors(up, down, &sigma, &datshift);
        fSys[i][j].mult = sigma;
        fSys[i][j].add = fSys[i][j].mult*fData[i]/100; // note: computed from the unshifted fData[i]
        fSys[i][j].type = MULT;
        fSys[i][j].name = "CORR";
        shift[i] += datshift;
      }
    }
    else // Deal with lines that contain no asymmetric errors
    {
      for (int i=0; i<fNData; i++)
      {
        lstream >> fSys[i][j].mult >> unneeded_info;
        fSys[i][j].add = fSys[i][j].mult*fData[i]/100;
        fSys[i][j].type = MULT;
        fSys[i][j].name = "CORR";
      }
    }
  }

  for (int i=0; i<fNData; i++)
  {
    fData[i] *= (1.0 + shift[i]*0.01); // Shift of central value due to asymmetric errors

Since the commondata file contains the shifted data point, doing something like Sys[i][j].mult*Data[i]/100 will clearly not give the same as Sys[i][j].add (if loaded from the commondata file), because the central value has changed. So I am pretty sure that this does fall under the set of datasets with asymmetric uncertainties which have an inconsistency. Also, it's not instantly obvious how to solve this with multiplicative errors; I think it should be discussed before we fix it.

I didn't actually use the API for this. If you are curious to see it for yourself, I basically copied and pasted the loader function from NNPDF/nnpdf#476 into a notebook and pointed it at the commondata file in question. There is currently no documentation on this because it's still a WIP. I then used glob.glob to get every DATA* file in the commondata directory and loaded each one with the function I copied and pasted. There is documentation on the API in the sphinx docs, and if you're interested in a particular dataset then in theory it could be convenient, because you could do something like

from validphys.api import API
# resolve dataset_input as CommondataSpec using API
cd = API.commondata(dataset_input={'dataset': 'ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP'})
# copy and paste load_dataset from PR I mentioned
cd_df = load_dataset(cd.datafile)
voisey commented 5 years ago

Thanks for the explanation and for looking into this! Yes, I can see what the problem is with the single top filters now. What do you mean by "it's not instantly obvious how to solve this with multiplicative errors"? Would shifting the data before computing the additive systematics not fix it?

Thanks a lot for the info about the API!

wilsonmr commented 5 years ago

It would fix it in the sense that the columns would be equal.

But basically, across datasets there are inconsistencies in how the shift is applied. In the example above, the shift and new error have been calculated by putting a percentage into the symmetriseErrors function, and the shift has then been applied like D_j = D_j * prod_i (1 + shift_i/100). In other datasets the shift has instead been calculated with the error cast as an additive error and then added to the data point, and I believe the two produce quite different central values and errors.

voisey commented 5 years ago

Yeah, I see what you're saying. Let's ask about this at the code meeting then

voisey commented 5 years ago

I've just checked this and the central values seem to be the same whether you use multiplicative or additive errors in symmetriseErrors. However, the shifted uncertainty, sigma, that I get for the multiplicative case, call it sigma_mult, corresponds to its additive counterpart, sigma_add, through sigma_add = central_value * sigma_mult / 100, not through sigma_add = shifted_central_value * sigma_mult / 100 as you would expect.

wilsonmr commented 5 years ago

I think I wasn't explicit enough:

case 1: use a percentage in symmetriseErrors, get the shift and the new error. In singletop you shift by doing D_j = D_j * prod_i (1 + sigmamult_i/100)

case 2, see D0ZRAP:

    symmetriseErrors(up, down, &sys, &shift);

    fSys[i][0].add = sys;
    fSys[i][0].mult = (sys*100.0)/fData[i];

    fSys[i][0].type = MULT;
    fSys[i][0].name = "UNCORR";     // treat sys as uncorrelated

    fData[i] += shift;     // Shift due to asymmetric error

The shift is added like D_j = D_j + sum_i shift_add_i

For a single uncertainty these give the same central value, but clearly for 2 or more they don't; that's because the shift in the mult case is done as a cumulative percentage. I should say that in the original reference https://arxiv.org/pdf/physics/0403086.pdf they add the shifts rather than cumulatively multiplying by the percentage, but also I'm not sure the arguments in the original reference even apply to multiplicative uncertainties.
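
To make the difference concrete, here is a minimal sketch of the two conventions as described above; the numbers are made up and not taken from any dataset, and the shifts are assumed to have already come out of symmetriseErrors (as a percentage in case 1, as an absolute value in case 2).

central = 100.0
shifts_percent = [2.0, -3.0]
shifts_absolute = [central * s / 100 for s in shifts_percent]

# case 1 (singletop-style): cumulative multiplicative shift
shifted_case1 = central
for s in shifts_percent:
    shifted_case1 *= 1.0 + s / 100

# case 2 (D0ZRAP-style): additive shifts summed onto the data point
shifted_case2 = central + sum(shifts_absolute)

print(shifted_case1)  # 98.94
print(shifted_case2)  # 99.0 -> equal for one shift, different for two or more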

wilsonmr commented 5 years ago

Both of these examples are MULT errors, which is why I was saying it's not instantly clear which is correct.

wilsonmr commented 5 years ago

This issue is now dedicated to the SINGLETOP and *_SF datasets, both of which concern using symmetriseErrors and then shifting the expected value.

mariaubiali commented 5 years ago

@voisey @wilsonmr Just to understand, if we set the default SYSTYPE to MULT, then the fits including single tops should be fine, as the ADDITIVE ones are incorrect, is that right?

mariaubiali commented 5 years ago

Btw, who is supposed to document this? Looking at the list of tasks here https://www.wiki.ed.ac.uk/display/nnpdfwiki/Documentation+plan, should this be discussed under vp-setup filtering or under buildmaster? I would say buildmaster, which is assigned to @tgiani. What is the status?

Zaharid commented 5 years ago

@mariaubiali The way commondata works (in a way that should be improved at some point...) is that we store the same uncertainty always in two ways: as a percentage (with the factor 100) of the data and as an absolute value. Then the percentage uncertainties are used for sampling and for building the covmat from multiplicative systematics, while the absolute values are used for additive systematics. This issue is about the fact that the value w.r.t. which the percentage is computed in these commondata implementations is slightly inconsistent. This is because we shift the central value when we symmetrize errors, but we use some non-final version of the central value to compute the percentage. So to answer your question, the systematics that would be affected are the multiplicative ones.

One possible solution is to write a second for loop that writes out the multiplicative versions after all the shifts have been computed.
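
Schematically (illustrative Python rather than the actual C++ filter code, with made-up names), the idea would be something like:

# Sketch of the "second loop" fix: accumulate all shifts and additive
# uncertainties first, and only then write the multiplicative columns,
# using the final (shifted) central values.
def finalise_systematics(central, add_sys, shifts):
    # central: list of unshifted data points
    # add_sys: add_sys[i][j] = additive uncertainty j for point i
    # shifts:  shifts[i] = total shift of point i from symmetrisation
    shifted = [c + s for c, s in zip(central, shifts)]
    mult_sys = [[100.0 * a / d for a in row] for row, d in zip(add_sys, shifted)]
    return shifted, mult_sys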

Another is to change the commondata format and associated code to not write things redundantly and wrongly (we can always use the ratio to rescale for t0).

I do have to say I'd be surprised if this had any real impact, but we should do it correctly anyway.

On top of that, as @wilsonmr points out, nobody has thought very carefully about how the shift should be made when you have correlated systematics.

wilsonmr commented 5 years ago

I agree with Zahari's summary above

On top of that, as @wilsonmr points out, nobody has thought very carefully about how the shift should be made when you have correlated systematics.

This. I have actually been confusing myself with the original reference for symmetrising errors and a slightly more in-depth version, both by D'Agostini, and I think I have come to the conclusion that:

I think these issues should be discussed at a phone conference; once I have written notes and everyone has discussed them and hopefully reached some consensus, we can probably turn some combination of the consensus and notes into documentation, as well as make any necessary changes.

As Zahari says, if there are to be any changes, the impact will probably be minimal because the difference is in the 'semi-diff' terms, which should be comparatively small compared to the 'average' terms.

voisey commented 4 years ago

Hi @wilsonmr, did you get any further on your note on this? I'm hoping to fix this for single top now. I opened https://github.com/NNPDF/buildmaster/pull/125 but then realised that this may still be wrong because symmetriseErrors should (perhaps?) take absolute uncertainties, rather than percentage uncertainties. Is this your understanding?

wilsonmr commented 4 years ago

Hi Cameron, sorry, I've just not done anything on this because other things have taken precedence.

I don't think there is a simple answer to your question. I personally think the shift should be applied additively, because I understand the paper's arguments for doing that. Having said that, the paper doesn't really consider asymmetric multiplicative uncertainties, so I'm not certain that applying it like D_j = D_j * prod_i (1 + sigmamult_i/100) is incorrect; it just seems a bit ad hoc - if somebody could write down a similar argument concerning moments of the distribution that showed it was the correct thing to do then I'd be a bit happier.

voisey commented 4 years ago

Hi Michael. No problem at all, I totally understand. Thanks for your help with this!

Zaharid commented 3 years ago

@voisey @wilsonmr @enocera Did we forget about this or is it sorted already? If not perhaps now would be a good time.

wilsonmr commented 3 years ago

This was forgotten about. I think in the end it wasn't clear to me whether the symmetrisation should be applied to multiplicative uncertainties, but if it should, we probably ought to apply it to the absolute values rather than the relative ones and then make sure the mult columns are self-consistent at the end.

It's not a completely trivial task

wilsonmr commented 3 years ago

I will try and remind myself of what's going on

wilsonmr commented 3 years ago

I think the minimal solution, which seems best to me right now, is simply to make the columns consistent. Whether we are doing the right thing (and how much of an impact changing it would have) should probably be addressed in the future.

In which case, I think for now let's go with @voisey's solution of

Would shifting the data before computing the additive systematics not fix it?

and then I don't mind opening an issue regarding whether or not this is the right thing to do.

wilsonmr commented 3 years ago

It's actually rather a lot of work to fix all of this (even in the minimal sense of getting the columns self-consistent without changing the implementation), so if it's going to happen we really need to divide and conquer.

I think we should discuss this first thing at the code meeting on Wednesday @scarrazza @Zaharid and work out a plan, and assign some people if we are going to take action.

I then think that this should be properly addressed before 4.1 - it probably makes little difference, but we keep bragging about our small uncertainties and that we validated them, and the validation (closure and future testing) relies on the initial implementation actually being correct.

wilsonmr commented 3 years ago

4.0 datasets affected:

So actually this is less than I thought. I need to double check I got everything (there are 40 sets affected but as @enocera points out many are deprecated).

EDIT: it's actually ATLAS_WM_JET_8TEV_PTJ and ATLAS_WP_JET_8TEV_PT which are bugged; the bug is easy to fix but not a priority since they're not used in 4.0

EDIT 2: ATLASZPT8TEVMDIST and ATLASZPT8TEVMDIST actually agree if we use the allclose tolerance instead of the testing tolerance.

wilsonmr commented 3 years ago

OK @voisey and @enocera, the number of 4.0 sets affected is even smaller than I originally thought. CMSTTBARTOT should be easy because it has very few data points/systematics. CMS_2JET_3D_8TEV appears to have 2 systematics, so it's also pretty easy. D0ZRAP is easy because there appears to be only a single systematic, and ATLASPHT15 has only 2 systematics, so I will actually just take care of this and request a review from you both.

In terms of maintaining reproducibility: 3.1 only included D0ZRAP and CMSTTBARTOT, so I will create a new dataset for these. I suppose the dijets were discussed in an ~external~ non-core paper; as one of the authors, @enocera, would you like me to create a _40 version of this set? For ATLASPHT15 I think I will just edit the set, since Emma has left the collaboration and otherwise the set is new.

enocera commented 3 years ago

@wilsonmr Thanks for looking into that so quickly. I would do the following:

wilsonmr commented 3 years ago

excellent, I'll do this ASAP

voisey commented 3 years ago

@wilsonmr I've just gone through all of the open PRs and they all look good to me, thanks a lot for your work on sorting this out! Are there any more datasets that need to be fixed?

wilsonmr commented 3 years ago

I think we can essentially close this; we got all the relevant datasets. We should be aware that if we ever introduce or re-introduce a dataset we should plug it into the test I added and check that the test passes.

I suppose if we wanted to be really safe we could make a list of datasets which we don't fit but which still have inconsistent commondata columns, and then issue a warning if these dataset inputs are processed. This would slightly slow down the code if a blocklist had to be checked every time, but the list wouldn't be super long... @Zaharid what do you think?

Zaharid commented 3 years ago

I think that would be overkill. Maybe we could have some ugly warnings in e.g. the plotting file labels.

wilsonmr commented 3 years ago

Hmm, similar idea, but what about a deprecated flag, which raises an error* at the point the plotting file is checked, i.e. when grouping happens?

*or warning

wilsonmr commented 3 years ago

I'm happy to just close this; it's just that, given the old versions of the _SF sets sneaked into the 4.0 runcards, a deprecation flag might be useful.

Zaharid commented 3 years ago

Yes, that would make sense. Probably best done on top of #1149.

Zaharid commented 3 years ago

With that said, I am going to close this for the sake of the momentary endorphin rush.