eic / EICrecon

EIC Reconstruction - JANA based
https://eic.github.io/EICrecon
GNU Lesser General Public License v3.0

Proposal for improved calorimeter cluster associations #1475

Closed: wdconinc closed this 2 months ago

wdconinc commented 5 months ago

Is your feature request related to a problem? Please describe.

Right now, the cluster associations are determined using the following strategy: https://github.com/eic/EICrecon/blob/fa7377c9261724b849e9e6f54a581ebe1fe42522/src/algorithms/calorimetry/CalorimeterClusterRecoCoG.cc#L78-L81

This leaves something to be desired, in particular when studying undesired cluster splitting.

In particular, these issues are hampering studies like: how many clusters are actually composed of two particles that were incorrectly lumped together in one cluster?

Describe the solution you'd like

We (well, @AkshayaVijay) are working on an analysis where we go from clusters to hits to hit contributions to MCParticles, and then recursively follow the MCParticle parent relations until an MCParticle has a vertex inside the tracking volume (i.e. up to a maximum radius equal to the inner radius of the calorimeter). When combined with the energy deposition in the cluster, this could result in a better association of the cluster with the 'real' MCParticle.
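A minimal sketch of the trace-back step described above, for illustration only. `SimParticle`, `vertexRadius` and `traceToTrackingVolume` are hypothetical stand-ins, not edm4hep or EICrecon types. The sketch assumes a single parent per particle and a purely transverse radius cut, which is a simplification of the real tracking-volume boundary.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an MC particle record (not the edm4hep type).
struct SimParticle {
  double vx, vy;  // production vertex, transverse coordinates (cm)
  int parent;     // index of the parent in the particle list, -1 if none
};

// Transverse radius of the production vertex.
inline double vertexRadius(const SimParticle& p) {
  return std::hypot(p.vx, p.vy);
}

// Follow parent links from `index` until reaching a particle whose vertex lies
// inside the tracking volume (radius < rMax), or a particle without a parent.
// Assumption: rMax plays the role of the calorimeter inner radius.
inline int traceToTrackingVolume(const std::vector<SimParticle>& particles,
                                 int index, double rMax) {
  assert(index >= 0 && static_cast<std::size_t>(index) < particles.size());
  while (vertexRadius(particles[index]) >= rMax && particles[index].parent >= 0) {
    index = particles[index].parent;
  }
  return index;
}
```

With a particle created at r = 95 cm whose ancestors were created at 90 cm, 70 cm and the origin, a cut at `rMax = 80` would stop at the 70 cm ancestor rather than walking all the way to the primary.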

Describe alternatives you've considered

Another approach could be a DD4hep plugin that keeps track of the parents, or rewrites some of the particle decay graph for particles that are created outside the tracking volume, so that a shower can more easily be traced back to its parents.

wdconinc commented 5 months ago

RFC @ruse-traveler @veprbl

veprbl commented 5 months ago

This is implemented in #1396, where we didn't see a change.

wdconinc commented 5 months ago

No, this is different. The changes in #1396 look at ALL contributors (good), and the MCParticle of the highest contribution. This feature request is to allow for potentially different MCParticles per cluster. In particular #1396 still has weight equal to 1: https://github.com/eic/EICrecon/blob/080d9ae7586d32766c0b222165c9b83f5850ecd7/src/algorithms/calorimetry/CalorimeterClusterRecoCoG.cc#L367

What I would like as an outcome here is:

MCParticles:
0. E = 10 GeV, PDG = 22
1. E = 5 GeV, PDG = 11, r_vtx = 70 cm, parent = 0
2. E = 5 GeV, PDG = -11, r_vtx = 70 cm, parent = 0

EcalBarrelScFiClusters:
0. E = 10 GeV

EcalBarrelScFiClusterAssociations:
0. simID = 1, recID = 0, weight = 0.5
1. simID = 2, recID = 0, weight = 0.5

wdconinc commented 5 months ago

Of course, #1396 is a good start towards this since it accesses more of what is needed already. There are two parts missing there, I think:

Right now, the largest energy deposition IS usually from the hit in the first layer, so the MCParticle of that contribution is what we want as an association. But that's not necessarily the case: for the barrel imaging calorimeter pixel layers (which I admit being out of scope here), the highest energy deposition will not be in the first pixel layer where a single pixel is hit by the primary, but in a subsequent pixel layer where one pixel is hit by a bunch of secondaries. There, we need to trace back the shower-production-graph to the primary instead of creating an association with (not saved) simID = 2000.

veprbl commented 5 months ago

I think the latest iteration for #1396 is your second bullet point. The first bullet point is not implemented, like you said.

veprbl commented 5 months ago

TBH I don't understand why the status quo method works well. I would expect that the high-energy particle traveling through the first radiation length is not the highest contribution in the shower.

wdconinc commented 5 months ago

> I think the latest iteration for #1396 is your second bullet point.

Is it? Maybe it's still local somewhere, but I would have expected at least something like

while (mcp.parents_size() > 0) {
  mcp = mcp.getParents(0);
}

(though in this case, I'd argue that's throwing away the gamma -> e+ e- conversion in an MPGD layer).

veprbl commented 5 months ago

You are right, none of that is implemented in #1396. I'm not sure what it does, anymore.

wdconinc commented 5 months ago

But I think if we want, #1396 could be merged after 24.06 is tagged, and then that's a starting point for those additions?

veprbl commented 5 months ago

I prefer to keep the simple existing version of our code. For #1396 to be accepted, it would be nice to have some explanations of what it does and why.

veprbl commented 5 months ago

(my opinion, other people can review according to their best judgement)

ruse-traveler commented 5 months ago

Hi all, this is actually sort of the direction I was thinking of pushing #1396 in 🤣

I think it would be pretty straightforward to extend what's there to allow for multiple associations. But for the weights, would it make sense to calculate the weight using the total contributed energy of a particle rather than the total energy of the particle? For example, I'm not sure we would want to give too much weight to an energetic muon that mips through everything...
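One way the contributed-energy weighting suggested here could look, as a hedged sketch. `Contribution` and `associationWeights` are hypothetical names, not part of EICrecon or #1396. The sketch assumes each hit contribution has already been attributed to a retained particle, and it normalizes to the summed contributions rather than to the reconstructed cluster energy.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for one hit contribution after trace-back:
// the retained particle it is attributed to and the energy it deposited.
struct Contribution {
  int simID;      // index of the associated (traced-back) particle
  double energy;  // deposited energy (GeV)
};

// Sum the deposited energy per particle and normalize by the total, so that
// the weights of all associations of one cluster add up to 1.
inline std::map<int, double> associationWeights(
    const std::vector<Contribution>& contributions) {
  std::map<int, double> weights;
  double total = 0.0;
  for (const auto& c : contributions) {
    assert(c.energy >= 0.0);  // deposits are expected to be non-negative
    weights[c.simID] += c.energy;
    total += c.energy;
  }
  if (total <= 0.0) return {};  // no deposited energy, no associations
  for (auto& entry : weights) entry.second /= total;
  return weights;
}
```

Under this definition, two particles that each deposit 5 GeV in a cluster would both get weight 0.5, while an energetic muon that leaves only a small deposit would get a correspondingly small weight regardless of its own energy.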

And as you both note, currently there's nothing there to walk back to the primaries, but that can be added...

ruse-traveler commented 5 months ago

Also, correct me if I'm wrong, but by default we only keep mc particles produced in the tracking volume, right? And then dd4hep assigns all of the g4hits in the calorimeters to the particle in the decay chain that left the tracking volume?

ruse-traveler commented 5 months ago

(And also agreed on some explanations of what's going on with #1396: I can run some quick checks to get a better feel for how it changes the existing associations...)

wdconinc commented 5 months ago

> And then dd4hep assigns all of the g4hits in the calorimeters to the particle in the decay chain that left the tracking volume?

I don't think it's that smart. It assigns the hits to the actual shower particle that created it, but doesn't store that particle. (Need to verify that though.)

ruse-traveler commented 5 months ago

Oh!! I see! If that's the case, then yes: we should definitely constrain the associations to the tracking volume...

simonge commented 5 months ago

From my experience it did associate the hits with the particle that left the tracking area, which was always the primary in my case.

ruse-traveler commented 5 months ago

Interesting! I'll check from my side...

ruse-traveler commented 5 months ago

I can confirm: main does associate back to particles in the tracking region. Not always primaries in the BHCal case, though! (These were made with 18x275 NC DIS events (Q^2 > 100).)

[Figures: assocStatus, assocVXvsVY]