eic / EDM4eic

A data model for EIC defined with podio and based on EDM4hep.
https://eic.github.io/EDM4eic/
GNU Lesser General Public License v3.0

Proposal for a Jet Type #88

Open ruse-traveler opened 3 months ago

ruse-traveler commented 3 months ago

Since edm4hep has no jet-specific data type (e.g. see the discussions here and here), we opted to use edm4eic::ReconstructedParticle to hold the jet kinematic information when implementing jet reconstruction in EICrecon. This has obvious shortcomings:

Describe the solution you'd like

The FastJet PseudoJet provides a very nice starting point. One way something similar could be implemented in our data model:

  edm4eic::Jet:
    Description:  "A reconstructed jet, inspired by the FastJet PseudoJet"
    Author: "Joan Jet"
    Members:
      - uint32_t nCst // number of constituents in jet
      - float area // jet area [sr]
      - float energy // jet energy [GeV]
      - float bkgdEnergy // background energy density * area [GeV]
      - edm4hep::Vector3f momentum // jet 3-momentum [GeV]
    OneToManyRelations:
      - edm4eic::Jet jets // jets that have been combined to form this jet
      - edm4eic::ReconstructedParticle constituents  // constituents of the jets

Note that there are a couple of intentional design choices here:

The former point places jet reconstruction at the very end of our reconstruction workflow. The latter point reflects the fact that quantities like zg are often highly analysis-dependent, frequently algorithmically complex, and numerous. In my opinion, these would be better served as functions of jets that users could call during downstream analysis.

Also note that the jets one-to-many relation could be used to indicate things like sub-jets.

Describe alternatives you've considered

An alternative approach could be to design a "jet information" type that runs parallel to the jets and stores all of the information not captured by ReconstructedParticle.
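
As a rough illustration of this alternative, a parallel type might look like the sketch below. The name `edm4eic::JetInfo`, the relation name `jet`, and the member name `backgroundEnergy` are all hypothetical, chosen only to mirror the proposal above; the kinematics would stay in the associated ReconstructedParticle.

```yaml
  # Hypothetical sketch: the type name, member names, and relation name are
  # assumptions for illustration, not part of EDM4eic.
  edm4eic::JetInfo:
    Description: "Jet-specific information accompanying the ReconstructedParticle that holds the jet kinematics"
    Author: "Joan Jet"
    Members:
      - float area // jet area [sr]
      - float backgroundEnergy // background energy density * area [GeV]
    OneToOneRelations:
      - edm4eic::ReconstructedParticle jet // particle holding the jet kinematics
```

The trade-off is that a user then has to follow a relation to get from the jet-specific quantities to the kinematics, rather than reading both from a single object.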

veprbl commented 3 months ago

Looks good to me. Please don't use abbreviations (nCst, bkgd). We don't need to store the number of constituents separately, as we know the size of the OneToManyRelations.

ruse-traveler commented 3 months ago

Fair point!
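
Applying both suggestions to the proposal might give something like the following sketch. The spelled-out name `backgroundEnergy` is an assumption, and the constituent count is dropped on the understanding that it can be recovered from the size of the `constituents` relation.

```yaml
  # Sketch of the proposal with the review feedback applied.
  # `backgroundEnergy` is an assumed replacement for `bkgdEnergy`;
  # `nCst` is removed because the size of `constituents` gives the same number.
  edm4eic::Jet:
    Description: "A reconstructed jet, inspired by the FastJet PseudoJet"
    Author: "Joan Jet"
    Members:
      - float area // jet area [sr]
      - float energy // jet energy [GeV]
      - float backgroundEnergy // background energy density * area [GeV]
      - edm4hep::Vector3f momentum // jet 3-momentum [GeV]
    OneToManyRelations:
      - edm4eic::Jet jets // jets that have been combined to form this jet
      - edm4eic::ReconstructedParticle constituents // constituents of the jet
```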