Closed hufnagel closed 12 years ago
metson: Milestone T0 2_0_0 deleted
hufnagel: Include the job splitting parameters in the WMSpec, including the jobNamePrefix, which should be "Repack-Run
hufnagel: To get the ball rolling on this, there are open questions about a missing Config.DP method that gives you a repack config and also in general about how to configure the multiple outputs from repacking.
For the first version of this, you could just use a merge configuration for both the repack and the repackmerge part. The 'repack' job would get a merge configuration from Config.DP with a single output (single dataset), but otherwise it would be configured like a processing job, with normal unmerged output and support for direct-to-merge. We can also use this for developing the support for the error datasets (two output modules in the spec, one for normal, one for error dataset and selecting one of them at runtime based on information passed from the job splitter).
hufnagel: Another thing we could also already cover with this 'fake' repack spec is #3124.
evansde: I have commit rights in Conf/DP, let's just write a repacking method in there and go with that.
    def repack(self, whatArgsGoHere, *streamers):
        process = cms.Process("Repackappottamus")
        ...
        return process
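A runnable approximation of that sketch, with a plain dict standing in for cms.Process so it runs outside CMSSW. The argument names and the use of a streamer-file source are assumptions for illustration, not the eventual Conf/DP interface:

```python
# Hypothetical sketch of the proposed Configuration.DataProcessing repack
# method. A plain dict stands in for cms.Process here; the real method
# would build and return an actual CMSSW Process object.

def repack(global_tag, *streamer_files):
    """Return a minimal repacking configuration (illustrative only)."""
    process = {
        "name": "Repackappottamus",
        "globalTag": global_tag,
        "source": {
            "type": "NewEventStreamFileReader",  # reads streamer files
            "fileNames": list(streamer_files),
        },
        "outputModules": {},  # to be filled in, one module per dataset
    }
    return process

config = repack("GLOBALTAG::All", "file:stream1.dat", "file:stream2.dat")
```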
sfoulkes: Working on this. There are a couple of things I'm unsure about that we need to firm up:
I'll have to do two patches: one to bring the Repack spec in the T0 repo up to date, and one to remove the assumption in WMBase that all workflows only produce a single primary dataset.
sfoulkes: We should also fix #2949 and verify that the dataset naming in StdBase is OK.
sfoulkes: I'm also unsure of how error datasets work. I set the merge task up to have two output modules: the regular one and a "MergedError" output module. The primary dataset in the MergedError module has "Error" appended to it. One of these will be turned off at runtime depending on the size of the input data.
sfoulkes: The WMSpec stuff doesn't like complex types as values, so my SelectEvents data structure isn't going to fly. We'll have to make the repack method in Conf/DP something like: def repack(self, globalTag, **selectEvents)
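A sketch of what that flattened interface could look like: since the WMSpec reportedly only stores simple types, the dataset-to-trigger-path mapping is passed as keyword arguments of plain strings. All names here (module names, key names) are hypothetical:

```python
# Hypothetical flattened repack interface: each keyword argument maps an
# output module name to its SelectEvents string, avoiding complex types
# in the WMSpec.

def repack(global_tag, **select_events):
    """Build one output module per keyword, each with its SelectEvents."""
    modules = {}
    for module_name, paths in select_events.items():
        modules[module_name] = {
            "SelectEvents": [p.strip() for p in paths.split(",")],
        }
    return {"globalTag": global_tag, "outputModules": modules}

cfg = repack("GLOBALTAG::All",
             write_PhysicsA="HLT:path1, HLT:path2",
             write_PhysicsB="HLT:path3")
```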
hufnagel: There is already support in ConfigBuilder to pass in dictionaries to select output modules. I haven't gotten around to testing it yet though. Would look something like this
    outputs = [ { 'dataTier' : 'RECO', 'selectEvents' : 'HLT:path1, HLT:path2' },
                { 'dataTier' : 'RECO', 'selectEvents' : 'HLT:path3, HLT:path4' } ]
However we package this, the content needs to be passed to Config.DP. That is, a list of output modules and, for each output module, the tier and the selectEvents string.
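A sketch of expanding that list of dicts into named output modules, one per entry, each carrying its tier and its SelectEvents paths. The module naming scheme is an assumption for illustration:

```python
# Hypothetical expansion of the proposed outputs list into output module
# definitions. Module names are derived from tier and position here; the
# real naming convention would come from Config.DP.

outputs = [{"dataTier": "RECO", "selectEvents": "HLT:path1, HLT:path2"},
           {"dataTier": "RECO", "selectEvents": "HLT:path3, HLT:path4"}]

output_modules = {}
for index, entry in enumerate(outputs):
    name = "write_%s_%d" % (entry["dataTier"], index)
    output_modules[name] = {
        "dataTier": entry["dataTier"],
        "SelectEvents": [p.strip() for p in entry["selectEvents"].split(",")],
    }
```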
hufnagel: Btw, the determination of when to use the errorDataset output isn't primarily based on size. Output only goes to the error dataset if we have to break up a lumi, and the decision whether to break up a lumi is made at job scheduling time. When that decision is made, the information somehow needs to be stored with the job and passed to the runtime environment.
sfoulkes: Replying to [comment:19 hufnagel]:
Btw, the determination of when to use the errorDataset output isn't primarily based on size. Output only goes to the error dataset if we have to break up a lumi, and the decision whether to break up a lumi is made at job scheduling time. When that decision is made, the information somehow needs to be stored with the job and passed to the runtime environment.
This decision happens in the RepackMerge splitting algo, right? The spec creates the RepackMerge jobs with normal output modules and error output modules. Then one of the modules is turned off at runtime. We don't have to do anything with error datasets in the Repack tasks themselves, right?
hufnagel: Yes, the decision is made in the repackmerge splitting algo, because only it has the relevant Tier0 configuration parameters (whether to split at all and at what thresholds) and the information about which files belong to which lumi and their sizes.
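A sketch of the break-up decision that splitting algo would make per lumi: if the lumi's total input size fits under the merge threshold, it merges as one job; otherwise its files are packed into several jobs and the output is flagged for the error dataset. The threshold parameter and greedy packing are assumptions, not the actual WMCore algorithm:

```python
# Hypothetical per-lumi break-up decision for the repackmerge splitter.
# Returns whether the lumi was broken up (error dataset case) and the
# file-index groups forming each merge job.

def split_lumi(file_sizes, max_merge_size):
    """Return (breakup, job_groups) for one lumi's input file sizes."""
    if sum(file_sizes) <= max_merge_size:
        # Whole lumi fits into a single normal merge job.
        return False, [list(range(len(file_sizes)))]
    # Break the lumi up: greedily pack files into jobs under the threshold.
    jobs, current, current_size = [], [], 0
    for i, size in enumerate(file_sizes):
        if current and current_size + size > max_merge_size:
            jobs.append(current)
            current, current_size = [], 0
        current.append(i)
        current_size += size
    jobs.append(current)
    return True, jobs
```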
The actual repackmerge job itself and its CMSSW configuration have one output module; the only thing we need to change at runtime is whether the resulting file is accounted to the normal dataset or the error dataset.
Not sure what the best way to do this is. I thought (from our previous discussions) that we could overload the single CMSSW output with two output definitions in the spec, one for the normal and one for the error dataset, and then remove the one we do not need at runtime (or at the very least only use the one we need).
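That runtime selection could be sketched like this: the spec carries both definitions for the single CMSSW output module, and the job keeps only the one matching the splitter's decision. The definition names and keys are illustrative, not the actual spec schema:

```python
# Hypothetical runtime selection between the two output definitions the
# spec would carry for the single repackmerge output module.

def select_output(output_defs, use_error_dataset):
    """Keep only the normal or error dataset definition for the output."""
    wanted = "MergedError" if use_error_dataset else "Merged"
    return {name: d for name, d in output_defs.items() if name == wanted}

defs = {"Merged":      {"primaryDataset": "MinimumBias"},
        "MergedError": {"primaryDataset": "MinimumBiasError"}}
```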
sfoulkes: Initial Repack Spec is attached. There's not a lot going on, almost everything is already handled by the setupProcessingTask() and setupMergeTask() methods in StdBase. Questions for Dirk:
sfoulkes: Second patch contains changes to WMCore. It's about 50% cleanup, 25% better support for more involved Config.DP configs and 25% support for the more elaborate T0 merging. Dave, could you review this?
sfoulkes: (In 15201) Modify StdBase so that it doesn't assume that all workflows have only run over a single primary dataset. Modify the addMergeTask() method to support error datasets. Minor cleanup in the other specs. Fixes #1796.
From: Steve Foulkes sfoulkes@fnal.gov
hufnagel: Still need to look at the changes in the T0 code.
hufnagel: De-scoped this a bit to get something working more quickly. The version attached is fully featured repacking, but instead of the repack merge with the active split-lumi protections and the error dataset support, I am using a standard merge for now.
hufnagel: Please review (both #1796 and #3578) by checking that the RunConfig and Tier0Feeder unit tests work
mnorman: Tested in conjunction with #3578
hufnagel: (In eae1a75ab7113c5181c44939fb4f781be62f9863) Create Repack WMSpec, fixes #1796
Signed-off-by: Dirk Hufnagel Dirk.Hufnagel@cern.ch
We need a WMSpec that will run the repacking. It needs two tasks, first the actual repacking and then a merge step.
Each Repack WMSpec is stream specific. Embedded in the WMSpec is the dataset to trigger path mapping for the given stream. This is passed at runtime to Configuration.DataProcessing and returns a valid repacking configuration. This system is not commissioned yet, so for early testing we can also make up a repacking configuration, store it in the ConfigCache and embed the id in the WMSpec.
The post-repacking merge step cannot be implemented as a standard WMCore merge step. This is because of error datasets. If repacker size protections kick in, we need to decide at merge time whether the output goes to the normal dataset or an error dataset. The way we'll implement this is with a custom repack merge job splitting algorithm that passes the normal/error dataset decision to the job. At runtime the job then evaluates this flag and configures one of the two normal/error dataset output modules. Both output module to fileset mappings need to be defined in the WMSpec though.
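The flag handoff described above can be sketched as follows: the splitter stores its decision as a job parameter at splitting time, and the runtime reads it to pick the target dataset. The field names are illustrative, not the actual WMCore job schema:

```python
# Hypothetical handoff of the splitter's normal/error decision: stored
# with the job when it is created, read back in the runtime environment.

def make_merge_job(input_files, breakup_needed):
    """Create a merge job carrying the splitter's error-dataset flag."""
    return {"inputFiles": input_files,
            "useErrorDataset": breakup_needed}

def dataset_for_job(job, primary_dataset):
    """At runtime, route output to the normal or the error dataset."""
    return primary_dataset + ("Error" if job["useErrorDataset"] else "")
```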
Requires #2481 and #3096