lhcb / starterkit-lessons

Lessons taught at the Starterkit workshops.
https://lhcb.github.io/starterkit-lessons/

Turbo Data Flow #24

Closed by goi42 7 years ago

goi42 commented 7 years ago

In the "Changes to Data Flow in Run II" lesson, we show the Turbo stream as a way to bypass the stripping step. While updating the "An Introduction to LHCb Software" lesson, I noticed that the bookkeeping path `MC/2016/27163002/Beam6500GeV-2016-MagDown-Nu1.6-25ns-Pythia8/Sim09b/Trig0x6138160F/Reco16/Turbo03/Stripping28NoPrescalingFlagged/ALLSTREAMS.DST` lists `Stripping28NoPrescalingFlagged` under `Turbo03`.

I want to make sure I teach this right. Does this mean the Turbo stream was "resurrected" with Tesla and then included in the Stripping28 campaign? If so, should we update the data flow diagram in the lesson?

saschastahl commented 7 years ago

For MC, Tesla (which creates the Turbo stream output) and DaVinci (which creates the Stripping output) are run one after the other, so that all the information is available in one simulated sample. For data, this sequential running does not exist.
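As a rough illustration, the ordering of the processing passes can be read off the bookkeeping path quoted at the top of this thread. The sketch below simply splits that path and keeps the components that look like processing passes. The prefix list and the helper name `processing_passes` are assumptions made for this example, not an official bookkeeping convention or API.

```python
# Bookkeeping path copied from the first comment in this thread.
BK_PATH = (
    "MC/2016/27163002/Beam6500GeV-2016-MagDown-Nu1.6-25ns-Pythia8/"
    "Sim09b/Trig0x6138160F/Reco16/Turbo03/"
    "Stripping28NoPrescalingFlagged/ALLSTREAMS.DST"
)

# Assumed prefixes for processing passes. This is an illustrative guess
# at the naming pattern seen in this one path, not an official list.
PASS_PREFIXES = ("Sim", "Trig", "Reco", "Turbo", "Stripping")


def processing_passes(bk_path):
    """Return the path components that look like processing passes, in order."""
    parts = [p for p in bk_path.split("/") if p]
    return [p for p in parts if p.startswith(PASS_PREFIXES)]


passes = processing_passes(BK_PATH)
print(" -> ".join(passes))
```

For this MC sample, the Turbo pass (Tesla) appears before the Stripping pass (DaVinci), which matches the sequential running described for MC.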

goi42 commented 7 years ago

Should we update the "Changes to Data Flow in Run II" flow diagram then? It shows Turbo bypassing stripping completely.

saschastahl commented 7 years ago

I think we should leave it as it is, since it shows what happens for data.

saschastahl commented 7 years ago

Reading the "Changes to Data Flow in Run II" lesson, I find it a bit inaccurate. The biggest storage saving from the Turbo stream is on tape, not on disk: the disk footprint is similar for Turbo and mDST. But Turbo does allow us to make better use of our overall computing resources.

goi42 commented 7 years ago

Forgive my ignorance here, but why would someone make stripped Turbo MC if there is no stripped Turbo data?

saschastahl commented 7 years ago

There are use cases where an MC sample is useful for analyses based on both Turbo lines and Stripping lines. This way you get the information from both. It only makes sense if the Stripping is run in flagging mode.

alexpearce commented 7 years ago

I think the number of people who find having both Turbo and Stripping data in the same file useful is very small.

The benefit is that the steps used to define MC productions don't need to depend on whether the analysis is a Turbo analysis or not; you only need to define the event type (and maybe some other generator-level specifics).