key4hep / EDM4hep

Generic event data model for HEP collider experiments
https://cern.ch/edm4hep
Apache License 2.0
25 stars 35 forks source link

Names of association types #344

Closed tmadlener closed 1 month ago

tmadlener commented 1 month ago

This is a discussion issue for settling the naming of the association types in EDM4hep. The names should be valid even after templated associations (https://github.com/AIDASoft/podio/pull/257) are available. To facilitate decision making, I will put dedicated questions into comments, which can then be voted on via either :+1: or :-1:. Feel free to add more questions if necessary.

As a brief overview, currently we have the following association types. In the following I assume that we have adopted the naming convention for from and to that is proposed in #341)

name from to
MCRecoParticleAssociation ReconstructedParticle MCParticle
MCRecoCaloAssociation CalorimeterHit SimCalorimeterHit
MCRecoTrackerAssociation TrackerHit SimTrackerHit
MCRecoCaloParticleAssociation CalorimeterHit MCParticle
MCRecoClusterParticleAssociation Cluster MCParticle
MCRecoTrackParticleAssociation Track MCParticle
RecoParticleVertexAssociation Vertex ReconstructedParticle

341 would give all of these an expilcit direction. Once we have https://github.com/AIDASoft/podio/pull/257 available, this direction would become much more implicit because people can do something like

auto assoc = MCRecoParticleAssociation{};
auto mc = assoc.get<MCParticle>();
auto rec = assoc.get<ReconstructedParticle>();

If we want to keep the explicit directions meaningful we would need to introduce associations for both directions. (Note that previously we didn't need to do this, because from and to were sim and rec, so there was no explicit direction).

Tagging a few people for enagagement: @gaede @jmcarcell @BrieucF @andresailer @Zehvogel @hegner

tmadlener commented 1 month ago

Should we have associations that have explicit directions in both directions? E.g. have an (equivalent of)

edm4hep::MCRecoParticleAssociation:
  OneToOneRelations:
    - MCParticle from
    - ReconstructedParticle to

edm4hep::RecoMCParticleAssociation:
  OneToOneRelations:
    - ReconstructedParticle from
    - MCParticle to
tmadlener commented 1 month ago

Should the names of the associations contain the full typenames? E.g.

edm4hep::ReconstructedParticleMCParticleAssociation:
  # ...

(see next comment for an alternative)

tmadlener commented 1 month ago

Should we keep the current shortened type names, but make them a bit more consistent and clear? A proposal (add a comment if you have a different suggestion for any of them)

current name proposed name
MCRecoParticleAssociation RecoMCParticleAssociation
MCRecoCaloAssociation CaloHitSimCaloHitAssociation
MCRecoTrackerAssociation TrackerHitSimTrackerHitAssociation
MCRecoCaloParticleAssociation CaloHitMCParticleAssociation
MCRecoClusterParticleAssociation ClusterMCParticleAssociation
MCRecoTrackParticleAssociation TrackMCParticleAssociation
RecoParticleVertexAssociation VertexRecoParticleAssociation
tmadlener commented 1 month ago

Should we shorten the Association part to Assoc?

Result: No to Assoc

gaede commented 1 month ago

As mentioned this morning, I believe that from and to, i.e. a direction makes end-user lives easier, as for example in a one-to-many relation that would have to be created w/ a utility. At that point the user will have to decide then in any case what the from and to direction is. Rather than Association I would use sth. shorter, e.g. Link - not sure why you would shorten the linked type names but then spell out Association every time ? And I agree that Assoc is ugly...

tmadlener commented 1 month ago

I am not sure we need a from and to even in the case of one-to-many relations. In the end as long as the types in the association are distinct, the interface of that utility could look something like

template</* the association collection type probably*/>
struct AssociationNavigator {
  template<typename T>
  std::vector<U> get(const T& obj); // maybe getAssociated?
};

where T and U are the FromT and ToT of the Association. Hence, usage would look something like:

// From MC to reco particles
auto& mc2recoAssocs = event.get<MCRecoParticleAssociationCollection>("MCRecoTruthLink");
auto mc2reco = AssociationNavigator{mc2recoAssocs};
// From reco to MC particles
auto& reco2mcAssocs = event.get<MCRecoParticleAssociationCollection>("RecoMCTruthLink");
auto reco2mc = AssociationNavigator(reco2mcAssocs);

// Usage: First get reco & mc particle collections
auto& mcps = event.get<MCParticleCollection>("MCParticles");
auto& recos = event.get<ReconstructedParticleCollection>("PandoraPFOs");

// retrieve reco particles for a given mc particle
auto relRecs = mc2reco.get<ReconstructedParticle>(mcps[0]);
// retrieve mc particles for a given reco particle
auto relMCs = reco2mc.get<MCParticle>(recos[0]);

The main reason why from and to are important in LCIO is because the LCRelation is effectively type erased, but in the EDM4hep / podio case, we have type information available and from and to information can be encoded in collection names for example.

Zehvogel commented 1 month ago

From a users perspective I found the from and to mostly useful to make sense of the weights in the one-to-many case but I would be also happy with the AssociationNavigator as outlined above...

tmadlener commented 1 month ago

There is a prototype implementation of the AssociationNavigator (https://github.com/AIDASoft/podio/pull/646). It turns out the get method can be done completely untemplated (for distinct types in the association), and in the draft the lookup of associated objects effectively becomes

// retrieve reco particles for a given mc particle
auto relRecs = mc2reco.getAssociated(mcps[0]);
for (const auto& [rp, w] : relRecs) {
  auto momentum = rp.getMomentum();
  // w is the weight of the association that linked the MC and reco particle
}
tmadlener commented 1 month ago

Should we switch from Association to Link?

tmadlener commented 1 month ago

Rather than Association I would use sth. shorter, e.g. Link - not sure why you would shorten the linked type names but then spell out Association every time ? And I agree that Assoc is ugly...

I put Link up for debate. I like it. I think the only shortened name in the comment above is the one that links ReconstructedParticle and MCParticle. That would be ReconstructedParticleMCParticleAssociation instead of RecoMCParticleAssociation. The others all have full type names, I think.

edit: Going from Association to Link would also require some renaming work in AIDASoft/podio#257

tmadlener commented 1 month ago

To bring this discussion to an end and in order to move forward, I think we more or less all agree on the following decisions. Please react to this comment if you are NOT happy with these proposed names.

Hence, I propose the following new names and will implement them in #341 for inclusion in the pre-release tag old name new name
MCRecoParticleAssociation RecoMCParticleLink
MCRecoCaloAssociation CaloHitSimCaloHitLink
MCRecoTrackerAssociation TrackerHitSimTrackerHitLink
MCRecoCaloParticleAssociation CaloHitMCParticleLink
MCRecoClusterParticleAssociation ClusterMCParticleLink
MCRecoTrackParticleAssociation TrackMCParticleLink
RecoParticleVertexAssociation VertexRecoParticleLink

This is I think the minimal change that let's us move forward. Longer term we can still introduce directed links and switch to the templated ones from podio.