GeorgeCocks-01 / URSS


Looking at W boson candidates in the data #2

Closed mvesteri closed 2 years ago

mvesteri commented 2 years ago

Hi @GeorgeCocks-01,

I copied another file to /tmp/13TeV_2018_34_Up_EW.root. This one will be too big to put in your home area so I suggest to just read it directly from that location. Once you are in the epp group you will be able to see this file from its normal location in epp storage.

Inside that file there is a TTree at WpIso/DecayTree. This tree contains candidate W boson decays to muons. Unlike the Z events that you were looking at, these events will contain a lot of background.

First task is to plot the mu_PT column. Do you see evidence of any signal?
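A minimal sketch of this first task, assuming uproot and NumPy are installed and that mu_PT is stored in GeV (the thread does not state the units). The helper names `load_columns` and `pt_histogram`, the binning and the 0 to 100 GeV range are illustrative choices, not something given above.

```python
import numpy as np


def load_columns(path, tree_name, columns):
    """Read the requested branches of a TTree into a dict of NumPy arrays."""
    import uproot  # imported lazily so the histogram helper works without it

    with uproot.open(path) as root_file:
        return root_file[tree_name].arrays(columns, library="np")


def pt_histogram(pt, n_bins=100, pt_range=(0.0, 100.0)):
    """Return (counts, bin_edges) for the muon pT inside a fixed range."""
    return np.histogram(pt, bins=n_bins, range=pt_range)


# Typical use on the machine that holds the file (not executed here):
# data = load_columns("/tmp/13TeV_2018_34_Up_EW.root", "WpIso/DecayTree", ["mu_PT"])
# counts, edges = pt_histogram(data["mu_PT"])
```

The counts and edges can then be drawn with any plotting tool, for example matplotlib's `plt.stairs(counts, edges)`.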

Then you can look at the mu_PTSUMCONE040 column. This is the sum of the pT of all particles within a "cone" around the muon that is approximately invariant under boosts along the beam axis. The cone radius is defined by delta_R = sqrt(delta_eta**2 + delta_phi**2). In this column the cone radius is delta_R = 0.4.

We refer to variables like this as "isolation". Can you think how this might discriminate between W -> mu nu events and background from QCD (jets)?
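The cone definition above can be sketched in a few lines. This is only an illustration: the function names are made up, and whether the muon itself is counted in mu_PTSUMCONE040 is not stated in this thread, so the sum below runs only over whatever other particles are passed in.

```python
import numpy as np


def delta_r(eta1, phi1, eta2, phi2):
    """sqrt(delta_eta**2 + delta_phi**2), with delta_phi wrapped into [-pi, pi)."""
    d_eta = np.asarray(eta1) - np.asarray(eta2)
    d_phi = (np.asarray(phi1) - np.asarray(phi2) + np.pi) % (2.0 * np.pi) - np.pi
    return np.hypot(d_eta, d_phi)


def cone_pt_sum(mu_eta, mu_phi, other_pt, other_eta, other_phi, radius=0.4):
    """Sum the pT of the supplied particles that lie within `radius` of the muon."""
    other_pt = np.asarray(other_pt, dtype=float)
    in_cone = delta_r(mu_eta, mu_phi, other_eta, other_phi) < radius
    return float(other_pt[in_cone].sum())
```

The wrapping of delta_phi matters because phi = +3.1 and phi = -3.1 are close together on the circle even though their plain difference is large.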

GeorgeCocks-01 commented 2 years ago

Hi @mvesteri, I don't seem to have access to /tmp/13TeV_2018_34_Up_EW.root when I try to open it with uproot (PermissionError: [Errno 13]). I can still access the previous .root file we used for the Z decay, and I can see the new file if I cd into /tmp/ in my terminal.

mvesteri commented 2 years ago

Ah, sorry, I forgot to chmod a+x the file 🤦

Try again

GeorgeCocks-01 commented 2 years ago

Screenshot 2022-07-13 152559 Screenshot 2022-07-13 152619

Here are the plots of both transverse momentum columns. I think in the plot of mu_PT there is a slight bump at ~40 GeV (i.e. roughly half the W mass). As to how the cone pT can be used to discriminate between signal events and the background, would it be by subtracting the cone pT from mu_PT?

mvesteri commented 2 years ago

👍 I suggest to make a 2D (scatter, heat, as you prefer) plot of the muon pT versus the log10 of the cone variable.

Then you should see how the cone variable is useful 😉

Indeed in your first plot there is a hint of a bump at ~40 GeV, but it sits on top of a lot of background at this stage.

GeorgeCocks-01 commented 2 years ago

I've had to discard all entries where the cone variable is 0 (the distribution peaks at 0) so that log10 would work, but I got this plot:

muonvscone

I'm not too sure why the scale goes up to 7000 GeV, but there are two distinct clusters. Would I be right in saying that the small cluster is from the W boson and the large one is from the background?

GeorgeCocks-01 commented 2 years ago

Hi @mvesteri, if the graph above is correct then is there another task I can start?

mvesteri commented 2 years ago

Sorry for the delay in responding. I forgot to mention that you need to do log10(max(some_sensible_min_value,isolation)) rather than log10(isolation). Then I'd suggest to plot with a colour map rather than with points.

At the moment I can't see the signal in the plot

GeorgeCocks-01 commented 2 years ago

No worries. I'm a little confused as to what you mean by a colour map though. I can't find a standard one in matplotlib. Also by max(some_sensible_min_value,isolation), do you mean all values of isolation above some_sensible_min_value?

mvesteri commented 2 years ago

For the plotting maybe like this example https://matplotlib.org/2.0.2/examples/pylab_examples/hist2d_log_demo.html

What I mean is that if the isolation is less than some static value you set it to that value. In python, C++, ... there are functions like max(a,b) that work with numerical types.
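As a small sketch of that clamp in NumPy: `np.maximum` is the element-wise counterpart of the scalar `max(a, b)`, so it works on a whole column at once. The function name and the default floor of 0.1 are placeholders for illustration; any sensible minimum can be substituted.

```python
import numpy as np


def log10_isolation(isolation, floor=0.1):
    """log10 of the isolation after raising every value below `floor` to `floor`."""
    return np.log10(np.maximum(floor, isolation))
```

With this, entries with zero isolation are kept and all land at log10(floor) instead of being dropped or producing -inf.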

GeorgeCocks-01 commented 2 years ago

muonvscone

Here's the graph. I've tried a few different values of min but all that changes is the minimum value on the y axis and I couldn't see any behaviour emerging.

mvesteri commented 2 years ago

I suggest to put the min at 0.1 GeV, and make sure that the y-axis range extends slightly below log10(0.1)

GeorgeCocks-01 commented 2 years ago

image

Here's the graph with those parameters

mvesteri commented 2 years ago

Ah, I was misreading the x-axis scale. The problem is that all of the interesting stuff is far to the left of the plot. The region of interest is, very roughly, 0 < muon pT < 100 GeV.
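Putting the suggestions so far together, a hedged sketch of the 2D histogram could look like the following. The 0.1 GeV floor and the 0 to 100 GeV pT window come from this thread; the upper edge of the log10(isolation) axis, the number of bins and the function names are assumptions made for the example, and pT is assumed to be in GeV.

```python
import numpy as np


def pt_vs_log_isolation(pt, isolation, floor=0.1, pt_range=(0.0, 100.0),
                        log_iso_range=(-1.2, 2.5), bins=(50, 50)):
    """2D counts of muon pT versus log10(max(floor, isolation))."""
    log_iso = np.log10(np.maximum(floor, isolation))
    counts, pt_edges, iso_edges = np.histogram2d(
        pt, log_iso, bins=bins, range=[pt_range, log_iso_range])
    return counts, pt_edges, iso_edges


def draw_colour_map(counts, pt_edges, iso_edges):
    """Draw the counts as a colour map with a logarithmic colour scale."""
    import matplotlib.pyplot as plt  # only needed when actually drawing
    from matplotlib.colors import LogNorm

    fig, ax = plt.subplots()
    # histogram2d returns counts[x, y]; pcolormesh expects C[y, x], hence the .T
    norm = LogNorm(vmin=1, vmax=max(float(counts.max()), 10.0))
    mesh = ax.pcolormesh(pt_edges, iso_edges, counts.T, norm=norm)
    ax.set_xlabel("muon pT [GeV]")
    ax.set_ylabel("log10(isolation / GeV)")
    fig.colorbar(mesh, ax=ax, label="candidates")
    return fig
```

The y-axis starts at -1.2 so that the row of clamped entries at log10(0.1) = -1 is visible rather than sitting on the edge of the plot.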

GeorgeCocks-01 commented 2 years ago

image

There's a feature at ~20 GeV that looks like a slight overlap. Is that the W boson and the background, or should it not look like that?

mvesteri commented 2 years ago

Ignore the step at 20 GeV. That is an artefact of the way that the data were processed. You might even start the plot from 20 GeV.

Do you see the faint population of data at pT ~ 40 GeV and at lower isolation values?

That is the signal. The rest, especially at higher isolation, is QCD background, i.e. jets.

Next you might make 1D plots of the muon pT in slices of the isolation. E.g. if you require the isolation to be smaller than 5 GeV you should see a much more pronounced signal feature.

GeorgeCocks-01 commented 2 years ago

image

Here's the graph clipped at isolation = 5GeV.

For the 1D plot, do you mean a plot of counts vs mu_PT but just for one given value of log10(isolation)? e.g. 0.25 as it should then show a peak at the W signal.

mvesteri commented 2 years ago

For the 1D plot: show the counts versus mu_PT but only for events where the isolation is less than 5 GeV.
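A minimal sketch of that selection with a boolean mask, assuming both columns are NumPy arrays in GeV. The function name, the binning and the 20 to 100 GeV range are illustrative choices.

```python
import numpy as np


def pt_with_isolation_cut(pt, isolation, max_isolation=5.0,
                          n_bins=40, pt_range=(20.0, 100.0)):
    """Histogram the muon pT using only events with isolation < max_isolation."""
    pt = np.asarray(pt)
    isolated = np.asarray(isolation) < max_isolation  # one True/False per event
    return np.histogram(pt[isolated], bins=n_bins, range=pt_range)
```

The cut selects events, not a single isolation value: every event passing `isolation < 5` contributes its mu_PT to the same 1D histogram.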

GeorgeCocks-01 commented 2 years ago

image

I think I finally got it to work, there's a reasonable bump at ~40GeV now among the background.

mvesteri commented 2 years ago

Nice. There is your W signal, still on top of quite a lot of background. You can read off the plot that the W mass is roughly 80 GeV, i.e. about twice the position of the pT peak. I'm sure that you can see that extracting a W mass with an uncertainty of 1 part in 10,000 is hard 😄
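The factor of two follows from two-body decay kinematics. As a rough sketch, assuming the W has negligible transverse momentum and neglecting the muon mass and the W width (none of which is spelled out in this thread), the muon pT depends on the decay angle θ* in the W rest frame as:

```latex
\begin{align}
p_T &= \frac{m_W}{2}\,\sin\theta^{*}, \\
\left|\frac{d\cos\theta^{*}}{dp_T}\right|
  &= \frac{4\,p_T}{m_W^{2}}\,
     \frac{1}{\sqrt{1 - 4\,p_T^{2}/m_W^{2}}} .
\end{align}
```

The Jacobian factor in the second line grows sharply as pT approaches m_W / 2, which is why the pT spectrum peaks near 40 GeV for a W mass of about 80 GeV. In real data the W transverse momentum and the detector resolution smear this edge.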

mvesteri commented 2 years ago

Next step: you could see what the distribution would look like if there was no background: /tmp/13TeV_2018_34_Up_W_Sim09k.root is the same format as the file that you are currently looking at except that these events contain simulated pure W->mu nu signal.

GeorgeCocks-01 commented 2 years ago

Would you mind giving me access to the file? I'm getting the permission denied error again.

mvesteri commented 2 years ago

Oops. It seems I did chmod a+x instead of chmod a+r. It should work now.

GeorgeCocks-01 commented 2 years ago

image

image

Great, thanks. Here are the two graphs. Very clear peak at ~40GeV for the single muon pT. As for the cone pT I'm a little confused. I zoomed in further and the peak seems to be perfectly on 0GeV. It then dips and there's another peak at ~0.25GeV. Doesn't the pT of the cone include the pT of the muon itself? So it should be almost the same since there's little to no background in this simulation?

mvesteri commented 2 years ago

👍 the shape of the muon pT for the signal simulation should be fairly intuitive.

The isolation does indeed have a complicated shape. That is because we are summing up the pT of discrete objects. The spike at zero is when the detector didn't find anything in the cone. For charged particles it is quite obvious. Either the pattern-recognition finds a track or it doesn't. For the calorimeter it is a bit fuzzier because we are adding up deposits of energy. The various steps in the distribution are due to the thresholds for reconstructing a single charged particle and a single calorimeter deposit. Above 1 GeV or so, you can see that the distribution becomes smooth.

mvesteri commented 2 years ago

I've now also copied a file with simulated QCD background, so you can see how the pT and isolation distributions look different: /tmp/13TeV_2017_29r2_Up_QcdBgdPt18GeV_Sim09k.root

With this file you can look at the WpNoMuID/DecayTree path instead.

GeorgeCocks-01 commented 2 years ago

I can't seem to access this file either. Would you mind giving me permission again?

GeorgeCocks-01 commented 2 years ago

@mvesteri not sure if you saw my previous comment, I've tried again today and don't have permission still.

mvesteri commented 2 years ago

Sorry, it should now work

GeorgeCocks-01 commented 2 years ago

image

image

Thanks! Here are the graphs of pT and isolation. The pT graph has a sharp peak at ~20GeV, not sure if that's due to some specific particle being made more than others.

mvesteri commented 2 years ago

👍 the steps in the pT distribution are due to artefacts in how the events are simulated and how they are processed. I wouldn't worry about what happens below 20 GeV.

mvesteri commented 2 years ago

Now you could look at some background enriched real data. In the file /tmp/13TeV_2018_34_Up_EW.root there is a tree under WpNoMuID/DecayTree. These events are selected in the same way as the W->munu signal candidates except that we don't require the muon to be identified. Therefore, the "muon candidates" are mostly pions and kaons.

It will be interesting to see how consistent these data are with the simulation of the QCD background.

GeorgeCocks-01 commented 2 years ago

image

image

The real data is very consistent, the only real difference I can find is that there is no artefact below ~20GeV on the mu_PT graph. The fact that these muon candidates aren't muons doesn't make much difference then? I was also wondering if the data is enriched via QCD simulations (like those in the last task) or with real background taken from experiments.

GeorgeCocks-01 commented 2 years ago

@mvesteri is there another task for me to do?

mvesteri commented 2 years ago

The next step is to make a thorough comparison of the simulated QCD background with the background enriched data.

E.g. if you require the "muon" pt to be above 20 GeV (to avoid the weird processing artefacts) do the shapes of the isolation distributions agree? I suggest to make a plot including the shapes of both of them. I suggest to just scale both histograms to have unit area (that's what I mean by "shape").
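One way to sketch the unit-area comparison, assuming NumPy arrays in GeV: `np.histogram(..., density=True)` divides the counts by their total and by the bin width, so the histogram integrates to one over the chosen range. The function name, the binning and the 0 to 140 GeV range are placeholders.

```python
import numpy as np


def unit_area_shape(values, pt, min_pt=20.0, n_bins=70, value_range=(0.0, 140.0)):
    """Unit-area histogram of `values` for events with pt > min_pt."""
    values = np.asarray(values)
    selected = values[np.asarray(pt) > min_pt]
    density, edges = np.histogram(
        selected, bins=n_bins, range=value_range, density=True)
    return density, edges
```

Calling this once with the data columns and once with the simulation columns, using identical binning, gives two shapes that can be overlaid on the same axes.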

GeorgeCocks-01 commented 2 years ago

image

image

The isolation of the real data peaks slightly more, but other than that the two are very similar. As for the pT, the peaks are misaligned. I'm not sure if that's due to the artefact or just how the data is. The peak is again higher for the real data.

GeorgeCocks-01 commented 2 years ago

image

Sorry, just re-read the task and realised you said to start it at 20GeV. The pT are now perfectly aligned.

mvesteri commented 2 years ago

👍 and the plot of the isolation should also be made with the "cut" pT > 20 GeV 😉

GeorgeCocks-01 commented 2 years ago

image

Ah sorry! The isolations from 20GeV are actually slightly off

mvesteri commented 2 years ago

Sorry, I meant plot the isolation (without any cut on the isolation) with a cut on the muon pT [at 20 GeV] 😉

GeorgeCocks-01 commented 2 years ago

image

image

So like this?

mvesteri commented 2 years ago

Great, yes. So indeed the modelling of the isolation isn't perfect.
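To put a number on the disagreement, one option is the bin-by-bin ratio of the two unit-area shapes. This is a sketch with an invented function name and placeholder binning; bins where the simulation is empty are left as NaN rather than divided by zero.

```python
import numpy as np


def shape_ratio(data_values, sim_values, n_bins=70, value_range=(0.0, 140.0)):
    """Ratio of unit-area histograms, data / simulation, per bin."""
    data_shape, edges = np.histogram(
        data_values, bins=n_bins, range=value_range, density=True)
    sim_shape, _ = np.histogram(
        sim_values, bins=n_bins, range=value_range, density=True)
    ratio = np.full(n_bins, np.nan)
    # Only divide where the simulation has entries; other bins stay NaN.
    np.divide(data_shape, sim_shape, out=ratio, where=sim_shape > 0)
    return ratio, edges
```

A ratio that is flat at 1 would mean the shapes agree; a slope or a step shows where in isolation the simulation departs from the data.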

Can you make the plot of the isolation in coarse "slices" of pT? It would be interesting to see if the mis-modelling changes qualitatively with pT.

GeorgeCocks-01 commented 2 years ago

Do you mean plotting the same axes but with a much smaller range of pT? Then doing this multiple times for different pT values?

mvesteri commented 2 years ago

Yes, e.g. for 20 < pT < 25 GeV, 25 < pT < 30 GeV, etc.

GeorgeCocks-01 commented 2 years ago

image

image

image

image

Here are 4 subplots of isolation at different values. It seems to get noisier as the value of pT gets larger

mvesteri commented 2 years ago

Sorry, I meant to plot the isolation over the same wide range (e.g. 0 - 140 GeV, as you had before), but with each plot made for a different range of the muon pT.
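A sketch of what is being asked for, assuming NumPy arrays in GeV: the isolation axis stays fixed at 0 to 140 GeV, and only the event selection changes from slice to slice. The slice edges beyond the two examples given earlier, the binning and the function name are assumptions.

```python
import numpy as np


def isolation_in_pt_slices(pt, isolation, pt_edges=(20.0, 25.0, 30.0, 35.0, 40.0),
                           n_bins=70, iso_range=(0.0, 140.0)):
    """Unit-area isolation histogram for each [low, high) slice of muon pT."""
    pt = np.asarray(pt)
    isolation = np.asarray(isolation)
    bin_width = (iso_range[1] - iso_range[0]) / n_bins
    shapes = {}
    for low, high in zip(pt_edges[:-1], pt_edges[1:]):
        in_slice = (pt >= low) & (pt < high)  # select events, not an axis range
        counts, edges = np.histogram(
            isolation[in_slice], bins=n_bins, range=iso_range)
        total = counts.sum()
        # Leave empty slices as all zeros instead of dividing by zero.
        shape = counts / (total * bin_width) if total > 0 else np.zeros(n_bins)
        shapes[(low, high)] = (shape, edges)
    return shapes
```

Running this on the data and on the simulation with the same `pt_edges` gives one data/simulation pair of isolation shapes per pT slice.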

GeorgeCocks-01 commented 2 years ago

image

image

image

image

No worries! Here are another 4 plots, of the pT this time, starting from 20 GeV. Again though, it looks like the higher the pT, the more noise there is.

GeorgeCocks-01 commented 2 years ago

@mvesteri are those graphs ok now?

mvesteri commented 2 years ago

No, wires still crossed 😆

I'm after a set of plots that all look like this one, except that the histograms only include events with the muon pT within some range.

GeorgeCocks-01 commented 2 years ago

image

image

image

Ah I understand now. Hopefully these look good. The signal gets much worse as the muon pT value is increased.

mvesteri commented 2 years ago

Interesting plots. Can you think why the simulation is failing (to describe the shape of the isolation distribution) at higher muon (hadron, in reality) pT?