Closed paorozo closed 6 years ago
Okay thanks, like I said, I hadn't realized that was separate from ACDC. Can you give me a list of parameters that you would like for Recovery? Is this also split by tasks?
The recovery procedure involves three phases. In phase 1 we create JSON files with the specifications of the new requests that will recover the missing lumis for one workflow. We get one JSON file per datatier. Here is an example of such a JSON file:
{"createRequest": {"InitialTaskPath": "/fabozzi_HIRun2015-HIMinimumBias1-02May2016_758p4_170306_123923_2547/DataProcessing", "CollectionName": "fabozzi_HIRun2015-HIMinimumBias1-02May2016_758p4_170306_123923_2547_a752b4de-28d9-11e7-b427-02163e00f196", "PrepID": "ReReco-HIRun2015-02May2016-0008", "Group": "DATAOPS", "RequestPriority": 900000.0, "ACDCDatabase": "acdcserver", "Memory": 9000, "Requestor": "prozober", "SizePerEvent": 300, "RequestString": "recovery-0-fabozzi_HIRun2015-HIMinimumBias1-02May2016_758p4_", "IgnoredOutputModules": [], "ACDCServer": "https://cmsweb.cern.ch/couchdb", "OriginalRequestName": "fabozzi_HIRun2015-HIMinimumBias1-02May2016_758p4_170306_123923_2547", "Campaign": "HIRun2015", "RequestType": "Resubmission", "TimePerEvent": 6}, "changeSplitting": {"DataProcessing": {"SplittingAlgo": "LumiBased", "halt_job_on_file_boundaries": "True", "lumis_per_job": 1}}, "assignRequest": {"SiteWhitelist": ["T1_DE_KIT", "T2_CH_CERN_HLT", "T1_FR_CCIN2P3", "T1_ES_PIC", "T2_US_MIT", "T2_IT_Legnaro", "T2_UK_London_Brunel", "T2_BE_IIHE", "T0_CH_CERN", "T2_IT_Pisa", "T2_CH_CERN"], "ProcessingVersion": 1, "MaxRSS": 2411724, "ProcessingString": "02May2016", "Dashboard": "reprocessing", "Team": "production", "UnmergedLFNBase": "/store/unmerged", "MergedLFNBase": "/store/hidata", "MaxVSize": 20411724, "OpenRunningTimeout": 0, "AcquisitionEra": "HIRun2015"}}
In phase 2 we inject the requests into requestManager2. We use reqMgrClient to do that:
python reqMgrClient.py -j <file.json>
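Since phase 1 produces one JSON file per datatier, the injection command above has to be run once per file. A minimal sketch of that loop follows; it only wraps the `python reqMgrClient.py -j <file.json>` invocation shown above, and the helper names `build_inject_command` and `inject_all`, as well as the dry-run default, are assumptions made for illustration.

```python
import subprocess


def build_inject_command(json_path, interpreter="python", client="reqMgrClient.py"):
    """Build the argument list for: python reqMgrClient.py -j <file.json>."""
    return [interpreter, client, "-j", json_path]


def inject_all(json_paths, dry_run=True):
    """Build one injection command per spec file and optionally run them.

    With dry_run=True (the default here) nothing is executed; the commands
    are only returned so they can be reviewed first.
    """
    commands = [build_inject_command(path) for path in json_paths]
    if not dry_run:
        for command in commands:
            # check=True stops at the first failed injection.
            subprocess.run(command, check=True)
    return commands
```

For example, `inject_all(["recovery_AOD.json", "recovery_DQMIO.json"])` would return the two commands without running them; the file names here are placeholders.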
In phase 3, we assign the requests.
So, the parameters we need to take into account are:
Just a reminder, we need to include the "Recovery" option, besides ACDC and Kill and Clone.
Currently, we create recoveries as follows: https://twiki.cern.ch/twiki/bin/viewauth/CMS/CompOpsPRWorkflowTrafficController#Recovering_Workflows