cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0
1.09k stars 4.33k forks source link

edm::OwnVector is deprecated #42734

Open makortel opened 1 year ago

makortel commented 1 year ago

ROOT's new RNTuple columnar data storage is not going to support dynamic polymorphism (as opposed to TTree). One such use case in our data formats is edm::OwnVector<T> that effectively behaves as std::vector<std::unique_ptr<T>>, including the dynamic polymorphism (i.e. can store owning pointers to objects of any class that inherits from T). The purpose of this issue is to list all the current uses of edm::OwnVector, discuss how to address them, and track the progress.

It should be noted than the deployment of the migration needs some thought, as the changes are very likely not backwards compatible (although, strictly speaking, the backwards compatibility of edm::OwnVector is not guaranteed across CMSSW release cycles).

I expect the framework group to do majority of the work, but help from domain experts would also be very welcome.

makortel commented 1 year ago

assign core

cmsbuild commented 1 year ago

New categories assigned: core

@Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild commented 1 year ago

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @rappoccio, @smuzaffar, @makortel, @sextonkennedy, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

makortel commented 1 year ago

1. Can be acted on immediately (nothing is persisted)

Clearly unused, can be cleaned up

edm::OwnVector<SiPixelRecHit>

edm::OwnVector<SiStripRecHit1D>

edm::OwnVector<SiStripRecHit2D>

edm::OwnVector<SiTrackerMultiRecHit>

edm::OwnVector<MTDTrackingRecHit>

edm::OwnVector<FastTrackerCluster>

edm::OwnVector<CSCStripHit>

Looks unused, can likely be cleaned up

edm::OwnVector<SiStripMatchedRecHit2D>

To be migrated to std::vector<std::unique_ptr<T>>

edm::OwnVector<BaseTrackerRecHit>

edm::OwnVector<TrackingRegion>

makortel commented 1 year ago

2. Could be acted on now, but likely causes backwards incompatibilities

edm::OwnVector<CSCRecHit2D>

edm::OwnVector<CSCSegment>

edm::OwnVector<DTSLRecCluster>

edm::OwnVector<DTRecHit1DPair>

edm::OwnVector<DTSLRecSegment2D>

edm::OwnVector<DTRecSegment4D>

edm::OwnVector<GEMCSCSegment>

edm::OwnVector<GEMRecHit>

edm::OwnVector<GEMSegment>

edm::OwnVector<ME0RecHit>

edm::OwnVector<ME0Segment>

edm::OwnVector<RPCRecHit>

The DTRecSegment4D, RPCRecHit, GEMRecHit, and GEMSegment are coupled via MuRecObjBaseProducer (DPGAnalysis/MuonTools/interface/MuLocalRecoBaseProducer.h). Either these four cases of edm::OwnVector need to be migrated together, or the code of MuRecObjBaseProducer is replicated for the std::vector case for the duration of the migration of those classes.

The CSCRecHit2D, DTRecHit1DPair, and RPCRecHit are coupled via MuonDetCleaner (TauAnalysis/MCEmbeddingTools/plugins/MuonDetCleaner.h). Either these three cases of edm::OwnVector need to be migrated together, or code of MuonDetCleaner is replicated for the std::vector case for the duration of the migration of those classes.

makortel commented 1 year ago

3. Dynamic polymorphism is used, need a new solution, backwards incompatibilities pretty much guaranteed

edm::OwnVector<TrackingRecHit>

edm::OwnVector<FastTrackerRecHit>

edm::OwnVector<reco::BaseTagInfo>

edm::OwnVector<reco::PFBlockElement>

edm::OwnVector<pat::UserData>

edm::OwnVector<reco::Candidate>

Discussion

We could develop an std::variant-based "OwnVector" that would provide a similar interface as edm::OwnVector (i.e. polymorphic access to the base class pointer/reference), but where the list of all possible concrete classes is defined explicitly. Need to test the schema evolution works if adding new possible concrete types to the variant. But would this be convenient for large class hierarchies, like TrackingRecHit and reco::Candidate, or should we think of different, more invasive solutions?

If we proceed with the std::variant-based OwnVector, we'd also need a support for std::variant in TTree before deployment.

makortel commented 1 year ago

4. Fireworks uses many of the collections

Some (many?) of the classes described in 2 and 3 were used in Fireworks. I'm not sure Fireworks needs any specific treatment beyond the migration described in 2 and 3. Backwards compatibility likely breaks in both cases.

makortel commented 1 year ago

assign reconstruction,fastsim,xpog,visualization

FYI @cms-sw/trk-dpg-l2 @cms-sw/tracking-pog-l2 @cms-sw/muon-dpg-l2 @cms-sw/muon-pog-l2 @cms-sw/btv-pog-l2 @cms-sw/pf-l2

cmsbuild commented 1 year ago

New categories assigned: fastsim,xpog,reconstruction,visualization

@mdhildreth,@jfernan2,@sbein,@ssekmen,@Dr15Jones,@simonepigazzini,@makortel,@mandrenguyen,@alja,@vlimant,@civanch you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel commented 1 year ago

For the cases listed in https://github.com/cms-sw/cmssw/issues/42734#issuecomment-1708901672 (migrating bunch of cases from edm::OwnVector<T> to std::vector<T>) I'm thinking if 14_0_X would be a reasonable point for (likely) breaking the backwards compatibility?

Dr15Jones commented 1 year ago

We could develop an std::variant-based "OwnVector" that would provide a similar interface as edm::OwnVector (i.e. polymorphic access to the base class pointer/reference), but where the list of all possible concrete classes is defined explicitly.

A major downside to this design is it inverts the code coupling. Presently, packages using the base class only have to be dependent upon the package with the base class. Using a std::variant would require dependent package to be dependent upon all packages which contain a class that inherits from the base class.

Additionally, adding a new type that inherits from the base class means updating all uses of std::variant.

bendavid commented 1 year ago

Just to remind that edm::OwnVector<TrackingRecHit> is also persisted in MINIAOD (ie wherever a breaking change is introduced, old MINIAOD can't be used to produce NANOAOD in a newer release for example):

Singularity> edmDumpEventContent root://cms-xrd-global.cern.ch//store/mc/RunIISummer20UL16MiniAODv2/DYJetsToMuMu_H2ErratumFix_TuneCP5_13TeV-powhegMiNNLO-pythia8-photos/MINIAODSIM/106X_mcRun2_asymptotic_v17-v2/2820000/A140E69B-7E17-5942-8B41-0DCEB11597ED.root | grep slimmedMuonTrackExtras
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> >    "slimmedMuonTrackExtras"    ""                "PAT"        
edmNew::DetSetVector<SiPixelCluster>    "slimmedMuonTrackExtras"    ""                "PAT"        
edmNew::DetSetVector<SiStripCluster>    "slimmedMuonTrackExtras"    ""                "PAT"        
vector<reco::TrackExtra>              "slimmedMuonTrackExtras"    ""                "PAT"        
Singularity> 
makortel commented 1 year ago

Just to remind that edm::OwnVector<TrackingRecHit> is also persisted in MINIAOD (ie wherever a breaking change is introduced, old MINIAOD can't be used to produce NANOAOD in a newer release for example):

Thanks Josh. Is the reading of a MINIAOD file produced on an earlier release cycle in a newer release cycle something that might become a requirement, or is it just hoped to work? (since strictly speaking backwards compatibility is guaranteed only for RAW)

simonepigazzini commented 1 year ago

it is something that we rely on quite heavily to reduce the number of times we have to start from AOD (or even earlier) in order to produce new NANO

makortel commented 1 year ago

it is something that we rely on quite heavily to reduce the number of times we have to start from AOD (or even earlier) in order to produce new NANO

And this is really across release cycles, rather than backporting the necessary changes to the release cycle where the MiniAOD was produced?

simonepigazzini commented 1 year ago

indeed. For two reason (as far as I can see):

1) it is much easier to add few modifier to a newer release to make sure we can read different MINIAOD as input to NANO rather than backporting features across several releases (think from 13_3 to 12_4, 10_6) 2) Often the changes are actually in the MINIAOD content itself, we can recompute them on the fly in newer releases while backporting would break rule of preserving the event content.

All this being said, if the change is necessary we can workout the right time to do it to minimize the need of accessing older MINIAODs. In general this will require re-making certain MINIAODs campaign starting from AOD in a release in which the OwnVector has been removed.

The main challenge I see here is with Run2 MINIAODs. It is by far easier to read them as they are to make new NANOAODs rather than either backport changes or reproduce all of them starting from AOD.

mmusich commented 1 year ago

Just to remind that edm::OwnVector<TrackingRecHit> is also persisted in MINIAOD

The same comment holds true for almost all Tracker alignment and calibration ALCARECO flavours. In case there is a plan of something like a Run3 UltraLegacy re-reco you better plan it very carefully if people are supposed to re-use the Prompt ALCARECO for that purpose, otherwise a massive rereco just for calibration purposes might be necessary.

mmusich commented 1 year ago

assign alca

cmsbuild commented 1 year ago

New categories assigned: alca

@perrotta,@consuegs,@saumyaphor4252 you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel commented 1 year ago

Just to remind that edm::OwnVector<TrackingRecHit> is also persisted in MINIAOD

The same comment holds true for almost all Tracker alignment and calibration ALCARECO flavours. In case there is a plan of something like a Run3 UltraLegacy re-reco you better plan it very carefully if people are supposed to re-use the Prompt ALCARECO for that purpose, otherwise a massive rereco just for calibration purposes might be necessary.

Thanks @mmusich for bringing this point up. Could you (or @cms-sw/alca-l2) elaborate why the same release cycle that was used to produce the ALCARECOs could not be used (or would be difficult to use) to derive these calibrations?

mmusich commented 1 year ago

elaborate why the same release cycle that was used to produce the ALCARECOs could not be used (or would be difficult to use) to derive these calibrations?

A variety of reasons, but mostly these two:

VinInn commented 1 year ago

The only drawback I see for std::vector<std::variant<A,B,C>> is the explicit dependency from A,B,C instead of the common base class.

if one knows that A,B,C have a common base class (say Z) runtime (dynamic) polymorphism is trivial (in absence of multiple inheritance). see example below.

At least for track/tracking the design is closed (the base class supports poor-man RTTI for fast down-casting) and most of the derived classes are in a single Data Format package

I hope root will support "schema evolution" in case one adds/deletes a type from a variant.

#include <iomanip>
#include <iostream>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

struct Z {
  virtual ~Z(){}
  virtual char operator()() const = 0;
};

struct A : public Z {
  ~A() override {}
  char operator()() const override { return 'A';}
};

struct B : public Z {
  ~B() override {}
  char operator()() const override { return 'B';}
};

struct C : public Z {
  ~C() override {}
  char operator()() const override { return 'C';}
};

using ABC = std::variant<A,B,C>;

Z &  toZ(ABC & v) { return *std::visit([](auto&& arg) -> Z* { return (Z*)(&arg);},v);}

using Cont = std::vector<ABC>;

int main() {

  Cont c;
  c.emplace_back(A());
  c.emplace_back(B());
  c.emplace_back(C());

  for ( auto & v : c) std::cout << toZ(v)() << std::endl;

  return 0;
}
makortel commented 1 year ago

Are the aforementioned expectations of MINIAOD and (some?) ALCARECOs to be readable by later release cycles being tested in the IBs?

makortel commented 1 year ago

At least for track/tracking the design is closed (the base class supports poor-man RTTI for fast down-casting) and most of the derived classes are in a single Data Format package

Are we 100 % sure the same classes (like edm::OwnVector<TrackingRecHit>) is not used to store e.g. muon hits/segments or FastSim hits? Or did you refer specifically to the BaseTrackerRecHit sub-hierarchy (which still includes the FastSim classes)?

mmusich commented 1 year ago

Are the aforementioned expectations of MINIAOD and (some?) ALCARECOs to be readable by later release cycles being tested in the IBs?

Yes, e.g. https://github.com/cms-sw/cmssw/blob/a2fe5aab0bfb328573d0d080b45d1c3ad38f2f01/RecoTracker/TrackProducer/test/BuildFile.xml#L33 for MINIAOD. For ALCARECO there's plenty of tests that exercise this (e.g. in Alignment/OfflineValidation).

makortel commented 1 year ago

Looking at 2023 prompt reco, the following collections seem to be used in MINIAOD

edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> >
edm::OwnVector<reco::BaseTagInfo,edm::ClonePolicy<reco::BaseTagInfo> >
edm::RangeMap<CSCDetId,edm::OwnVector<CSCSegment,edm::ClonePolicy<CSCSegment> >,edm::ClonePolicy<CSCSegment> >
edm::RangeMap<DTChamberId,edm::OwnVector<DTRecSegment4D,edm::ClonePolicy<DTRecSegment4D> >,edm::ClonePolicy<DTRecSegment4D> >

and the following in AOD (in addition to the ones in MINIAOD)

edm::RangeMap<CSCDetId,edm::OwnVector<CSCRecHit2D,edm::ClonePolicy<CSCRecHit2D> >,edm::ClonePolicy<CSCRecHit2D> >
edm::RangeMap<GEMDetId,edm::OwnVector<GEMRecHit,edm::ClonePolicy<GEMRecHit> >,edm::ClonePolicy<GEMRecHit> >
edm::RangeMap<GEMDetId,edm::OwnVector<GEMSegment,edm::ClonePolicy<GEMSegment> >,edm::ClonePolicy<GEMSegment> >
edm::RangeMap<RPCDetId,edm::OwnVector<RPCRecHit,edm::ClonePolicy<RPCRecHit> >,edm::ClonePolicy<RPCRecHit> >

so already the migration of (half of) the classes in category 2 (https://github.com/cms-sw/cmssw/issues/42734#issuecomment-1708901672) would run into conflict with the aforementioned backwards compatibility expectations.

@cms-sw/xpog-l2 From the core perspective we'd like to try out the migration of the edm::RangeMap<T, edm::OwnVector<T_Muon>> to edm::RangeMap<T, std::vector<T_Muon>> (i.e. the category 2), also as an exercise to see how complicated the migration process would be. The change would break (naive) backwards compatibility for both AOD and MINIAOD for these data products. One mitigation strategy could be to add EDProducers that convert the RangeMap<T, OwnVector<U>> to RangeMap<T, vector<U>> on the fly. Our first question is when such a change could be accommodated? Would it be useful to discuss in one of the future core software (or xpog) meetings?

simonepigazzini commented 1 year ago

Hi @makortel, an in person discussion would indeed be great. Our XPOG meeting on Wednesday 2pm is usually not too busy, we can accommodate a slot for you in an upcoming one (next one is Oct. 2nd).

makortel commented 1 year ago

an in person discussion would indeed be great. Our XPOG meeting on Wednesday 2pm is usually not too busy, we can accommodate a slot for you in an upcoming one (next one is Oct. 2nd).

Unfortunately next week (Oct 4) conflicts with O&C week. In the mean time @Dr15Jones prototyped the migration from the RangeMap<T, OwnVector<U>> in https://github.com/cms-sw/cmssw/pull/42917 . When would the following meeting be? Maybe we can use Chris' PR as a base for the further discussion?

simonepigazzini commented 1 year ago

Hi @makortel, the next meeting is going to be on the 18th at 2pm GVA time. The point we would like to discuss are:

Based on these two points we can either discuss in person at the meeting on the 18th (in case substantial effort will be required by POGs) otherwise we can iterate here and in the relevant PRs.

VinInn commented 1 year ago

I suspect that most of the edm::OwnVector<reco::Candidate> are homogeneous collections and can be replaced by a vector where T inherits from Candidate. One needs some wrapping into a base class that itself does not depend on T (a new ``edm::OwnVector?) and then say a virtual method that returns a vector<reco:::Candidate const*> and/or a virtual operator[] that returns the corresponding reco:::Candidate const & Of course edm should support the ability to consume a product declaring only a base class so that clients can get this new version ofedm::OwnVector``` w/o any knowledge of T.

Dr15Jones commented 1 year ago

Of course edm should support the ability to consume a product declaring only a base class

This already exists via the use of edm::View<T> were T is the base class.

VinInn commented 1 year ago

ah ok excellent. Thought it was necessary to introduce a "translator" producer. So One can save a vector<GenParticle> and a vector<PFCandidate> and consume any of those (or both) in a jetAlgo that consumes edm::View<reco:::Candidate> without introducing any further magic, isn't it?

wddgit commented 9 months ago

How far back is there practical value in maintaining backward compatibility? Is 10_6_X the oldest release which might have been used to create input files for future processes creating NANO from MINIAOD or doing whatever it is that ALCA will be doing or other similar use cases?

Is there some point where this backwards compatibility is already broken?

I ask this understanding that there has always been a hard requirement to be able to read RAW from some very early point in CMS history.

makortel commented 9 months ago

I ask this understanding that there has always been a hard requirement to be able to read RAW from some very early point in CMS history.

Correct, we guarantee backwards compatibility for RAW (that also includes Scouting). And formally, that has been the only backwards compatibility guarantee, and for other data types the backwards compatibility has been on best effort basis (although in practice the schema evolution has worked quite widely).

How far back is there practical value in maintaining backward compatibility? Is 10_6_X the oldest release which might have been used to create input files for future processes creating NANO from MINIAOD or doing whatever it is that ALCA will be doing or other similar use cases?

The scope we have planned so far are creation of NanoAOD(SIM) / MiniAOD(SIM) from MiniAOD(SIM) / AOD(SIM) that were produced in the Run 2 Ultra Legacy processing (done in 10_6_X) in official processing campaigns, and reading of ALCARECOs produced in the same Run 2 UL processing in the scope that those are being read in CMSSW tests.

Is there some point where this backwards compatibility is already broken?

In principle no, because both the Mini/AOD case and ALCARECO case should be being tested in CMSSW. In practice, at least one subtle issue had already creeped in (https://github.com/cms-sw/cmssw/issues/43923), one can always worry about the coverage of the tests.

wddgit commented 9 months ago

PR #43931 deals with the "Clearly unused" cases.

wddgit commented 9 months ago

PR #43987 takes care of OwnVector<SiStripMatchedRecHit2D>

wddgit commented 9 months ago

PR #44047 deals with OwnVector<BaseTrackerRecHit>

wddgit commented 9 months ago

PR #44063 deals with OwnVector<TrackingRegion>