DUNE / ND_CAFMaker

Code for running ND parameterized reconstruction and making CAFs
Apache License 2.0

Duplicate and "phantom" G4 true particles in MC dataset #71

Open cuddandr opened 1 month ago

cuddandr commented 1 month ago

The list of true secondary particles contains duplicate entries/particles with (apparently) valid information (G4ID, PDG, etc.) and some "phantom particles". These phantom particles are filled with their default values giving a G4ID of -1 and a PDG code of 0 (along with a NaN momentum). Reconstructed particles occasionally have these phantom particles as their true particle match.

This can be seen by iterating over the sr->mc.nu[idx].sec vector in the CAFs for a given truth interaction, for example in the MiniRun5 beta2a CAFs. So far this has only been seen in the true secondary particle list. Examples of duplicated G4 particles and of these "phantom particles" are in the attached output, along with the files and spill numbers.

caf_bug_output_20240626.txt
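
For anyone who wants to repeat the check on their own files, the scan over the secondary list can be sketched roughly as below. This is only an illustration: `TrueParticle`, `SecondaryAudit` and `AuditSecondaries` are hypothetical stand-ins, not types or functions from ND_CAFMaker or the CAF format, and the struct carries only the two fields mentioned in this report (G4ID and PDG code). In a real macro the vector would be sr->mc.nu[idx].sec.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a true particle; the real CAF record
// has many more members than the two used here.
struct TrueParticle {
  int G4ID;  // -1 on the "phantom" entries described above
  int pdg;   // 0 on the "phantom" entries described above
};

struct SecondaryAudit {
  std::size_t nPhantom = 0;    // entries with G4ID == -1 and pdg == 0
  std::size_t nDuplicate = 0;  // entries whose G4ID was already seen
};

// Count phantom and duplicated entries in one secondary list.
SecondaryAudit AuditSecondaries(const std::vector<TrueParticle>& sec) {
  SecondaryAudit out;
  std::map<int, std::size_t> seen;  // G4ID -> number of earlier occurrences
  for (const TrueParticle& p : sec) {
    if (p.G4ID == -1 && p.pdg == 0) {
      ++out.nPhantom;  // default-filled entry, not counted as a duplicate
      continue;
    }
    if (seen[p.G4ID]++ > 0) ++out.nDuplicate;
  }
  return out;
}
```

Calling `AuditSecondaries` once per truth interaction and printing the two counters alongside the file and spill number should give output comparable to the attached text file.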

noeroy commented 1 month ago

I think this is due to the IDs of the secondaries that are used for the MLreco passthrough particles here: https://github.com/DUNE/ND_CAFMaker/blob/main/src/reco/MLNDLArRecoBranchFiller.cxx#L531-L533. It should be gen_id for both. What happens is that we first create a new secondary because track_id doesn't match any particle in the list of already known secondaries, and then, when we assign a Geant4 ID to that particle, we assign it gen_id here: https://github.com/DUNE/ND_CAFMaker/blob/main/src/reco/MLNDLArRecoBranchFiller.cxx#L278

That induces duplicates because each time we encounter a secondary, we compare its track_id to the gen_id of the previous secondaries, which does not match, so a new particle is created.

I believe the -1 entries are created similarly, in the cases where we want to tie a reconstructed object to a true particle and end up creating an empty secondary instead.

That has been corrected in a branch that is being prepared for future updates. We are waiting for some additional information to be passed through from MLreco to finish its validation and create a pull request: https://github.com/DUNE/ND_CAFMaker/blob/bugfix/ancestor_id_mlreco/src/reco/MLNDLArRecoBranchFiller.cxx
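
To make the mechanism concrete, here is a toy model of the find-or-create step. It is not the actual MLNDLArRecoBranchFiller code: `PassthroughParticle`, `Secondary`, `FindOrCreate`, `FillMismatched` and `FillConsistent` are hypothetical names, and only the two IDs discussed above are modelled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified passthrough truth particle.
struct PassthroughParticle {
  int track_id;  // ID used for the lookup in the mismatched path
  int gen_id;    // ID that ends up stored as the Geant4 ID
};

// Hypothetical, simplified entry in the true secondary list.
struct Secondary {
  int G4ID;
};

// Return the index of the secondary whose G4ID equals lookupId;
// if none matches, append a new one carrying storedId.
std::size_t FindOrCreate(std::vector<Secondary>& secs, int lookupId, int storedId) {
  for (std::size_t i = 0; i < secs.size(); ++i) {
    if (secs[i].G4ID == lookupId) return i;
  }
  secs.push_back(Secondary{storedId});
  return secs.size() - 1;
}

// Mismatched IDs: look up with track_id but store gen_id, so a later
// lookup for the same particle does not find the stored entry.
std::size_t FillMismatched(std::vector<Secondary>& secs, const PassthroughParticle& p) {
  return FindOrCreate(secs, p.track_id, p.gen_id);
}

// Consistent IDs: gen_id for both the lookup and the stored value.
std::size_t FillConsistent(std::vector<Secondary>& secs, const PassthroughParticle& p) {
  return FindOrCreate(secs, p.gen_id, p.gen_id);
}
```

In this toy model, filling the same particle (with track_id different from gen_id) several times through `FillMismatched` appends a new entry on every call, while `FillConsistent` reuses the first one.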

Previously on some test file we had:

root [1] cafTree->GetEntries("rec.mc.nu.sec.G4ID == -1")
(long long) 104

with the same file after that bugfix:

root [3] cafTree->GetEntries("rec.mc.nu.sec.G4ID == -1")
(long long) 0

sindhu-ku commented 1 month ago

Can you check if this changed anything for the number of primaries as well?