Open slava77 opened 2 years ago
assign reconstruction,xpog
New categories assigned: xpog,reconstruction
@slava77,@jpata,@mariadalfonso,@gouskos,@fgolf you have been requested to review this Pull request/Issue and eventually sign? Thanks
A new Issue was created by @slava77 Slava Krutelyov.
@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
Tagging a (semi-)random list of people that might be interested to follow (and might also provide some insight?):
@tvami @echabert @carolinecollard @dvannerom
Seems the idea didn't fly so far, so let me try to initiate the discussion.
Some of the data could be truncated or zeroed out if it is not used.
- LocalPoint pos_. I would need some of the actual users of the data to comment about it.
- float charge_ /*cluster charge*/; float pathlength_ /*path inside the module*/; not sure how precise this needs to be for physics purposes, perhaps could be truncated.
- SiStripCluster or SiPixelCluster which themselves contain vectors of ADC data. Doing something about these I guess would be where we could make the larger dent into the on-file size. Perhaps one could consider to use for Strips the SiStripApproximateCluster recently introduced in https://github.com/cms-sw/cmssw/pull/33546/ ? As for the Pixel one, as it seems at least one analysis is re-running the Pixel CPE to get the track probQ and probXY (see failed attempt at: https://github.com/cms-sw/cmssw/pull/36247) I am not sure how much info could be dropped.
Here is a summary from the HSCP team:
Let me tag other people who might care about this package: @ViktorKutzner @ssekmen @dvannerom @kai-wei @srimanob @lowette @kdipetri @SlavaValouev
@tvami thanks for reply:
- An option may be to use float16 ("half") instead of "float (32)". float16 is already used in cmssw.
Would somebody from your group be available to study the effect of reducing to float16?
- As far as I know, it is used to check that the hit is within the region of interest (typically excluding the edges)
Can't it be checked, e.g., by checking the barycenter of the corresponding cluster? Do you really need post-CPE precision?
Consequently it's safer to keep them.
This is not the right approach. Let's study what is really needed and then trim it down to the bare minimum please. For example, could the code that calculates your cluster shape variable be moved upstream and one (or more) user-float(s) be added to the data format? About the saturated strips, do you need to know just if there is one or more, or their exact locations?
SiPixelCluster: This is needed as it is for passing it to the CPE to extract the probQ / probXY values
In light of https://github.com/cms-sw/cmssw/pull/36247, that could also be moved upstream, directly saving the quantities needed for the analysis.
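A possible starting point for the float16 study suggested above, as a rough sketch only: emulate a half-precision-like store by rounding a float to 10 explicit mantissa bits and look at the relative error on charge-like or pathlength-like values. The helper name `reduceMantissa` is hypothetical and not an existing cmssw function, and the sketch keeps the full float32 exponent range, so it probes mantissa precision only (finite inputs assumed).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a cmssw API): round a finite float to `bits`
// explicit mantissa bits. float32 has 23 such bits, float16 has 10.
// The exponent is left as is, so float16 range limits are NOT emulated.
inline float reduceMantissa(float x, int bits) {
  assert(bits > 0 && bits <= 23);
  const int shift = 23 - bits;
  if (shift == 0) return x;  // nothing to drop
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof(u));  // well-defined bit copy
  const std::uint32_t mask = (0xFFFFFFFFu >> shift) << shift;
  u += (1u << (shift - 1));  // add half of the dropped range (round to nearest)
  u &= mask;                 // clear the dropped low mantissa bits
  float y;
  std::memcpy(&y, &u, sizeof(y));
  return y;
}

// Relative error introduced by the reduction, for a non-zero input.
inline double relativeError(float x, int bits) {
  const double exact = static_cast<double>(x);
  const double reduced = static_cast<double>(reduceMantissa(x, bits));
  return std::fabs(reduced - exact) / std::fabs(exact);
}
```

With 10 mantissa bits the relative error is bounded by 2^-11 (about 0.05%), which could then be compared with the intrinsic dE/dx resolution to judge whether the truncation matters for the analysis.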
Hello Marco,
Sorry I missed this email which is actually very important for us FCP analyzers.
We need the local position of the hits to be able to ignore "edge hits". Hits near the edge of a module typically show a lower dE/dx and we need to clean these out of our hit sample for a given track. In our analyzer, this looks something like this:
hit_localX.push_back(dedx->pos(i).x());
hit_localY.push_back(dedx->pos(i).y());
where dedx is the DeDxHitInfo object associated to a given track:
const reco::DeDxHitInfo *dedx = NULL;
reco::DeDxHitInfoRef dedxref = dedxH->get(isolatedTrackref.key());
if(!dedxref.isNull()) dedx = &(*dedxref);
The charge and the pathlength are obviously the main info we need for the analysis to work. Now I'm not sure how precise the format under which they are stored must be.
We do not use the SiStripCluster nor the SiPixelCluster collections; we only need to know the detId of the hit (through something like SiStripDetId TrackerDetId(dedxId) in the strips, for instance) to know in which subdetector/layer the hit is. This is very important to correct for radiation effects.
Cheers, David
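The edge-hit cleaning described in this message could be sketched roughly as below. This is an illustration under stated assumptions, not the analyzer's actual code: the names `ModuleBounds` and `isEdgeHit` are hypothetical, the module is approximated as a rectangle with the local frame centred on it, and the margin is a free parameter. A real implementation would take the bounds from the tracker geometry.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical rectangular approximation of a module's active area,
// in the same units as the local hit position (local frame assumed
// to have its origin at the module centre).
struct ModuleBounds {
  float halfWidth;   // half extent along local x
  float halfLength;  // half extent along local y
};

// Returns true if the hit lies within `margin` of any module edge
// (or outside the module), i.e. if it should be dropped from the
// dE/dx hit sample.
inline bool isEdgeHit(float localX, float localY, const ModuleBounds& bounds, float margin) {
  assert(margin >= 0.f);
  return std::fabs(localX) > bounds.halfWidth - margin ||
         std::fabs(localY) > bounds.halfLength - margin;
}
```

A cut of this kind only needs the position to a precision comparable to the chosen margin, which is relevant to the question of how precisely the local position has to be stored.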
type tracking
type trk
Isn't this tracking-pog more than tracker-dpg, i.e. type tracking? The DPG, to my knowledge, doesn't use this, although with the current implementation they could possibly use it... in case we strip it down, I guess that's not true anymore.
Following a recent update in the dedx data selection in #36225, I think that it's a reasonable idea to review if all data saved in DeDxHitInfo in miniAOD is necessary. This data is saved for a somewhat small fraction of tracks, but the size turns out to be close to 0.4 kB per track (using an average from 100 events from workflow 136.793 from DoubleEG Run2017C).
DeDxHitInfo is a vector (per hit) of the following data, which after compression adds up to 36 bytes per hit (as seen in wf 136.793):
- DeDxHitInfoContainer: { float charge /*cluster charge*/; float pathlength /*path inside the module*/; DetId detId; LocalPoint pos; }
- SiStripCluster or SiPixelCluster, which themselves contain vectors of ADC data
Some of the data could be truncated or zeroed out if it is not used. @cms-sw/xpog-l2 @cms-sw/tracking-pog-l2 please check and comment (or redirect to experts) if the data reduction can be done.
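For orientation, the fixed-size per-hit fields listed above can be mirrored in a plain struct to see their uncompressed footprint. This is a sketch, not the actual data format: it assumes DetId wraps a 32-bit integer and LocalPoint holds three floats, and the names `HitFixedFields` and `hitsPerTrack` are hypothetical. The cluster payload (the ADC vectors) is variable-sized and not included.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the fixed-size per-hit fields (assumed layout, see above).
struct HitFixedFields {
  float charge;         // cluster charge
  float pathlength;     // path inside the module
  std::uint32_t detId;  // assumption: DetId stored as a 32-bit id
  float pos[3];         // assumption: LocalPoint stored as three floats
};

// 4 + 4 + 4 + 12 bytes on common platforms, before any compression.
static_assert(sizeof(HitFixedFields) == 24, "unexpected padding or float size");

// Average number of hits implied by a per-track and a per-hit size.
inline double hitsPerTrack(double bytesPerTrack, double bytesPerHit) {
  assert(bytesPerHit > 0.0);
  return bytesPerTrack / bytesPerHit;
}
```

Taking the quoted numbers at face value, about 400 bytes per track at 36 bytes per hit corresponds to roughly 11 hits per track.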