Mu2e / Offline

Offline software for the Mu2e experiment
Apache License 2.0

StrawDigiBundle members and StrawDigiMC matching #1272

Open edcallaghan opened 4 months ago

edcallaghan commented 4 months ago

This is a requested reconsideration of a data structure implemented in #1245, namely StrawDigiBundle, to avoid deep-copies. This prompted discussions about the necessity of deep-copies and dummy or null objects, which here are addressed in the opposite order, after two bits of context.

The first piece of context is that the purpose of that class is to aggregate a StrawDigi, StrawDigiADCWaveform, and StrawDigiMC, each of which contains partial information relevant to a tracker digi, in one place. Doing so allows for cleaner source code (i.e. simplified loops), and ensures that, in situations in which "a digi" needs to be generically considered, all of the available information is present.

The second piece of context is the nature of StrawDigiMC. This data product contains MC truth information associated with a tracker digi. The actual association between the MC truth and digital data products is implicit: that is, a user analyzing an art event may encounter one of two scenarios (neglecting the existence of StrawDigiADCWaveforms, to simplify the discussion). i) Both a StrawDigiCollection and a StrawDigiMCCollection are present, in which case they should be of the same size, and the nth element of the latter contains the MC truth info underlying the digital info contained in the nth element of the former. There is, e.g., no pointer inside the StrawDigiMC to its matching StrawDigi. This is typical when analyzing the output of simulation. ii) Only a StrawDigiCollection is present; this is the expectation of what real data will look like. These two situations are very distinct from a user's perspective: they are either doing some sort of debugging / review study and are intentionally looking at the output of simulation, or are performing an "actual analysis" and only using digital information.
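To make the implicit, index-based association concrete, here is a minimal sketch. `Digi`, `DigiMC`, and `truthFor` are hypothetical stand-ins invented for illustration, not the actual Offline classes or any art API; the only thing taken from the description above is that the nth MC entry corresponds to the nth digi, and that the MC collection may be absent altogether.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for StrawDigi and StrawDigiMC (not the real classes).
struct Digi   { int strawId; };
struct DigiMC { double trueEnergy; };

// Look up the MC truth for digis[i] purely by position.
// mcs == nullptr models scenario ii): no MC collection in the event.
std::optional<DigiMC> truthFor(const std::vector<Digi>& digis,
                               const std::vector<DigiMC>* mcs,
                               std::size_t i) {
  if (i >= digis.size()) throw std::out_of_range("digi index out of range");
  if (mcs == nullptr) return std::nullopt;  // data-like event: digital info only
  // The association only holds if the two collections are synchronized.
  if (mcs->size() != digis.size()) throw std::logic_error("collections out of sync");
  return (*mcs)[i];
}
```

Nothing in either element type points at its partner, so the lookup is only as good as the guarantee that the two collections have the same size and ordering.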

As to the presence of null objects (namely, stand-in StrawDigiMCs): #1245 makes life more complicated by allowing for a situation where a user may be viewing the output of a simulation which contains digis for which no MC truth information exists (because, e.g., the digis came from the real detector). Because the association between MC truth and digital information is implicit via synchronized collections thereof, this can only be handled in one of two ways: i) offer no MC truth information at all; or ii) associate "real detector" digis with dummy StrawDigiMCs which are flagged as not to be interpreted as actual MC truth. #1245 opted for the second option, so as to allow for MC truth matching of simulated digis, when possible. To avoid the dummy objects without "throwing away" MC truth info, it is necessary to implement an explicit association between the MC truth and digital information, either via pointers, art::Assns, or some other mechanism. This is conceptually fine, but i) workflows developed using this functionality would break when analyzing preexisting data for which the association is absent, and ii) further thinking would need to be done about adapting columnar views of the same data, e.g. in the StrawDigisFromStrawGasSteps diagnostics information.
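As a rough illustration of the second option, the sketch below pads the MC collection with flagged placeholders so that it stays index-aligned with the digis. All names here (`Digi`, `DigiMC`, `isTruth`, `SyncedDigis`, `mergeWithPlaceholders`) are hypothetical and chosen for the example; how the real StrawDigiMC marks a stand-in is not shown in this thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins (not the real Offline classes).
struct Digi { int strawId; };
struct DigiMC {
  double trueEnergy = 0.0;
  bool   isTruth    = false;  // false marks a placeholder, not real MC truth
};

struct SyncedDigis {
  std::vector<Digi>   digis;
  std::vector<DigiMC> mcs;  // always the same size as digis
};

// Append detector digis to simulated ones, adding one flagged placeholder
// per detector digi so that mcs[i] still lines up with digis[i].
SyncedDigis mergeWithPlaceholders(const std::vector<Digi>& simDigis,
                                  const std::vector<DigiMC>& simMCs,
                                  const std::vector<Digi>& realDigis) {
  assert(simDigis.size() == simMCs.size());
  SyncedDigis out{simDigis, simMCs};
  for (const Digi& d : realDigis) {
    out.digis.push_back(d);
    out.mcs.push_back(DigiMC{});  // default-constructed: isTruth == false
  }
  return out;
}
```

A consumer then has to check the flag before interpreting an entry as truth, which is the cost of keeping the collections synchronized instead of adding an explicit association.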

As to the data members of StrawDigiBundle being const values vs const references: the purpose of this class is to move information which is partitioned across different objects as a single unit, and in particular, to do so across different scopes. The data members are const because this object is not meant to support changes to its members: it is only meant to carry them through loops / function calls / etc. But because this involves crossing scope boundaries, they cannot be references. As one example, resolving collisions between newly-simulated and preexisting digis is done in a call from the simulation module, which takes a collection of StrawDigiBundles as input, and returns as output a similar collection, which in general contains newly constructed digi objects. If the members of StrawDigiBundle were references, then the objects underlying the returned StrawDigiBundles would have gone out of scope by the time the caller used them, and the references would hence be dangling. I've tested that this compiles but is problematic at runtime.
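The scope issue can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual collision-resolution code: `Digi`, `DigiMC`, `Bundle`, and `resolveCollisions` are hypothetical, and the merge rule (consecutive digis on the same straw collapse into one new digi with the earlier time) is made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins (not the real Offline classes).
struct Digi   { int strawId; double time; };
struct DigiMC { double trueEnergy; };

// Bundle of const *values*: it owns copies, so it stays valid wherever it goes.
struct Bundle {
  const Digi   digi;
  const DigiMC mc;
};

// Made-up rule: consecutive bundles on the same straw are merged into a
// newly constructed digi carrying the earlier time.
std::vector<Bundle> resolveCollisions(const std::vector<Bundle>& in) {
  std::vector<Bundle> out;
  for (const Bundle& b : in) {
    if (!out.empty() && out.back().digi.strawId == b.digi.strawId) {
      const Bundle prev = out.back();
      // `merged` is local to this block. If Bundle held `const Digi&`,
      // the bundle pushed below would refer to `merged` after it is
      // destroyed, i.e. a dangling reference once the function returns.
      Digi merged{prev.digi.strawId, std::min(prev.digi.time, b.digi.time)};
      out.pop_back();  // const members forbid assignment, so replace instead
      out.push_back(Bundle{merged, prev.mc});  // copies `merged` into the bundle
    } else {
      out.push_back(b);
    }
  }
  return out;
}
```

With reference members the same function would still compile, since binding a reference to a local is legal; the failure only shows up when the caller reads the returned bundles.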

tl;dr: To avoid dummy StrawDigiMCs, we need an explicit association between digital and MC truth info for the tracker. To implement StrawDigiBundle as an aggregate of references, the call graph for collision resolution needs to be restructured, and the nature of referencing preexisting digis (which is optional) must be reworked, so that StrawDigiBundles do not cross scope boundaries.