uhh-cms / cmsdb

CMS related campaigns, processes, cross sections and common definitions for analyses.
GNU General Public License v3.0

Inconsistent names for phase space cuts #35

Open riga opened 4 months ago

riga commented 4 months ago

Dataset names sometimes encode bounded intervals (pt30to50) and sometimes open-ended ones (pt50). The latter usually means pt ≥ 50, so we could name it pt50toinf.

pro

It could be better to make the phase space of a dataset fully obvious from its name, especially since there might be cases where a cut is only an upper bound, so that, say, mgg50 would mean mgg ≤ 50 and thus should be mgg0to50 rather than mgg50toinf.

con

Names are getting longer.


In any case, we have to find a consistent naming scheme. There are already some inconsistencies in the current master, especially in QCD datasets and processes, but also in DY, that need to be fixed once we have defined a common scheme.
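To make the fully explicit variant concrete, here is a minimal sketch of how such tags could be built from the bounds. The helper name `cut_name` and its signature are hypothetical and not part of cmsdb, and it assumes integer thresholds only.

```python
import math


def cut_name(var: str, low: int = 0, high: float = math.inf) -> str:
    """Build an explicit phase space tag such as 'pt50toinf'.

    Hypothetical helper for illustration, not an existing cmsdb function.
    A missing lower bound defaults to 0, a missing upper bound to infinity.
    """
    if low >= high:
        raise ValueError(f"empty interval for {var}: [{low}, {high})")

    def fmt(bound: float) -> str:
        # infinity is spelled 'inf', finite bounds are assumed to be integers
        return "inf" if math.isinf(bound) else str(int(bound))

    return f"{var}{fmt(low)}to{fmt(high)}"


print(cut_name("pt", 30, 50))    # pt30to50
print(cut_name("pt", 50))        # pt50toinf
print(cut_name("mgg", high=50))  # mgg0to50
```

With both bounds always spelled out, a lower-bound cut and an upper-bound cut on the same threshold can no longer share a name.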

mafrahm commented 4 months ago

I think the current convention is to always skip the toinf part and always keep the lower bound, e.g. dy_lep_pt0To50 (which is the only process with an interval starting at 0, as far as I can see). I have no strong opinion on whether to keep the 0to or the toinf part, but 0to is shorter and, as far as I can see, rather few samples start at a threshold of zero. Keeping both parts would of course also be an option, to be as verbose as possible. This would also clearly separate interval cuts from cuts that pick a single value (e.g. 1j, selecting events with exactly one jet).

In any case, I agree that we should decide on one convention and start enforcing it.
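As a rough sketch of what enforcing could look like, assuming we went with the fully explicit option (both bounds always present, lowercase `to`): the patterns and the `classify` helper below are hypothetical and only illustrate the idea, they do not reflect anything in cmsdb.

```python
import re

# Assumed convention (not decided in this thread): <var><low>to<high|inf>
INTERVAL = re.compile(r"^(?P<var>[a-z]+)(?P<low>\d+)to(?P<high>\d+|inf)$")
# Single-value cuts such as '1j' (exactly one jet): <value><object>
SINGULAR = re.compile(r"^(?P<value>\d+)(?P<obj>[a-z]+)$")


def classify(tag: str) -> str:
    """Return 'interval', 'singular' or 'nonconforming' for one name part."""
    if INTERVAL.match(tag):
        return "interval"
    if SINGULAR.match(tag):
        return "singular"
    return "nonconforming"


for tag in ["pt30to50", "pt50toinf", "1j", "pt50", "pt0To50"]:
    print(tag, classify(tag))
```

Under this assumption, pt50 and pt0To50 would be flagged as nonconforming, while interval cuts and single-value cuts stay clearly distinguishable.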