Closed ktht closed 3 years ago
To quantify how many samples have a nominal LHE scale weight (= 4th weight in the LHEScaleWeight array) that differs significantly from 1, and how large an impact this can have on a per-sample basis, I dumped the nominal LHE scale weight from all samples whenever it differed by more than 1% from unity (i.e. fell outside of [0.99, 1.01]). In 121 samples, or about 8% of all samples, at least one nominal LHE weight ended up outside of this range. To assess the impact on the normalization of the samples, I calculated the relative difference of the nominal LHE scale weight from unity with respect to the total event statistics of the sample:
According to this, the impact on the normalization is on the order of 1e-5, so effectively negligible, in all samples except for the single anti-/top samples, where the impact varies from 0.13% to 0.19%. After dumping the minimum and maximum values of the nominal LHE scale weights from every sample, it's clear to me that the extreme outliers occur only in the single anti-/top samples. What's even more bizarre is that:
ST_t-channel_antitop_4f_inclusiveDecays sample);
***********************************************************************
ST_t-channel_antitop_4f_inclusiveDecays sample);
***********************************************************************
ST_t-channel_antitop_4f_inclusiveDecays sample).
***********************************************************************
From what I can see, these effects are present in all single anti-/top samples, regardless of the era, which is why I propose the following plan:
edit: ok, it looks like the 2018 ST_t-channel_top_4f_inclusiveDecays sample is the only single anti-/top sample that is free of the above issues.
2nd edit: only the t-channel single anti-/top samples are problematic.
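The scan described above could be sketched roughly as follows in Python. This is only an illustration: the function name and the input (a plain list of nominal LHE scale weights for one sample) are hypothetical, and the definition of the impact as the net shift of the weight sum divided by the number of events is an assumption, since the exact formula used in the check is not shown in this thread.

```python
def nominal_weight_summary(nominal_weights, lo=0.99, hi=1.01):
    """Summarize the nominal LHE scale weights of one sample.

    nominal_weights: one nominal weight per event (hypothetical input format).
    lo, hi: the window around unity; weights outside of it count as outliers.
    """
    n_events = len(nominal_weights)
    if n_events == 0:
        raise ValueError("sample has no events")

    # Events whose nominal weight differs from unity by more than 1%.
    outliers = [w for w in nominal_weights if not (lo <= w <= hi)]

    # ASSUMPTION: the impact on the normalization is taken to be the net
    # deviation from unity, summed over all events and divided by the
    # total number of events in the sample.
    impact = sum(w - 1.0 for w in nominal_weights) / n_events

    return {
        "n_events": n_events,
        "n_outliers": len(outliers),
        "min": min(nominal_weights),
        "max": max(nominal_weights),
        "impact": impact,
    }
```

A sample would then be flagged if `n_outliers` is nonzero, and the `min` and `max` entries give the extreme values that were dumped per sample.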
I'm done with the coding part of this task. In addition to the above, I also updated the logic for calculating the event sums for individual LHE scale variations. As mentioned before, this has a negligible effect on the normalization of the samples, but it's useful to have it implemented now in order to remain consistent in the future.
Regarding the extreme outliers in the single anti-/top samples, some of the same samples also have a quite broad distribution of gen weights, so maybe there's a connection (see plots referenced in this comment).
Also, one event from the 2018 DYToLL_2J sample apparently has a negative nominal LHE scale weight:
***********************************************************************
* Row * Instance * run * luminosit * event * LHEScaleW *
***********************************************************************
* 63183 * 0 * 1 * 25153 * 75456888 * -1.630554 *
* 63183 * 1 * 1 * 25153 * 75456888 * -1.587341 *
* 63183 * 2 * 1 * 25153 * 75456888 * -1.607727 *
* 63183 * 3 * 1 * 25153 * 75456888 * -0.990112 *
* 63183 * 4 * 1 * 25153 * 75456888 * -0.941528 *
* 63183 * 5 * 1 * 25153 * 75456888 * -0.943389 *
* 63183 * 6 * 1 * 25153 * 75456888 * -0.634063 *
* 63183 * 7 * 1 * 25153 * 75456888 * -0.588104 *
* 63183 * 8 * 1 * 25153 * 75456888 * -0.582275 *
***********************************************************************
Although the sign factors out in the end, the minimum/maximum values are taken with the sign, so there's a slight inconsistency. It can be handled in LHEInfoReader, but I don't think it's going to make much of a difference.
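One way the sign inconsistency could be handled is to take the extrema of the absolute values. The sketch below is in Python and is not the actual LHEInfoReader code; the function name and the `use_abs` switch are hypothetical. In the usage example, only the value -0.941528 comes from the table above, and the other two weights are made up for illustration.

```python
def weight_extrema(weights, use_abs=True):
    """Return (min, max) of the given weights.

    With use_abs=True the sign is dropped first, so that a single
    negative-weight event does not become the minimum of the sample.
    """
    values = [abs(w) for w in weights] if use_abs else list(weights)
    if not values:
        raise ValueError("no weights given")
    return min(values), max(values)


# Nominal weights of three events: the negative one is taken from the
# DYToLL_2J event above, the other two are made-up illustrative values.
nominal_weights = [1.0, 1.002, -0.941528]
signed_extrema = weight_extrema(nominal_weights, use_abs=False)
unsigned_extrema = weight_extrema(nominal_weights, use_abs=True)
```

With the sign kept, the negative event sets the minimum of the sample; with absolute values, it is treated like any other event whose weight is about 6% below unity.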
Previously I verified that the 4th LHE weight in all events and samples was effectively equal to 1. However, in the bbww analysis it was discovered that this is not the case in some events. I'll revisit the issue. If it turns out that we have the same problem, then I'll probably compute the shifts in LHE weight as the ratio of the shift to the 4th weight (i.e. as LHEScaleWeight[i] / LHEScaleWeight[4], with i running from 0 to 8) and rerun the post-production of the affected samples in order to recompute the event sums.
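The proposed normalization could look roughly like the following Python sketch. The function names and the event representation (a pair of gen weight and the nine LHE scale weights) are assumptions for illustration, as is weighting each event by its gen weight when accumulating the sums; the actual post-production code may differ.

```python
NOMINAL_INDEX = 4  # position of the nominal weight in LHEScaleWeight
N_SCALE_WEIGHTS = 9


def normalized_scale_weights(lhe_scale_weights, nominal_index=NOMINAL_INDEX):
    """Return LHEScaleWeight[i] / LHEScaleWeight[nominal_index] for all i."""
    nominal = lhe_scale_weights[nominal_index]
    if nominal == 0.0:
        raise ValueError("nominal LHE scale weight is zero")
    return [w / nominal for w in lhe_scale_weights]


def event_sums(events):
    """Accumulate one event sum per scale variation.

    events: iterable of (gen_weight, lhe_scale_weights) pairs, which is a
    hypothetical representation chosen for this sketch.
    """
    sums = [0.0] * N_SCALE_WEIGHTS
    for gen_weight, lhe_scale_weights in events:
        ratios = normalized_scale_weights(lhe_scale_weights)
        for i, ratio in enumerate(ratios):
            sums[i] += gen_weight * ratio
    return sums


# The nine weights of the DYToLL_2J event listed earlier in this thread.
negative_event = [-1.630554, -1.587341, -1.607727, -0.990112, -0.941528,
                  -0.943389, -0.634063, -0.588104, -0.582275]
ratios = normalized_scale_weights(negative_event)
```

By construction the nominal entry of `ratios` is exactly 1, and for this event all nine ratios come out positive, which is the sense in which the sign factors out.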