SBNSoftware / icarus-production

The repository is intended to support ICARUS production activities
GNU General Public License v3.0

Define policy for location and folder name of productions #8

Open mt82 opened 5 months ago

mt82 commented 5 months ago

Policy on location and name of the folder where files of a campaign will be stored:

MFC: Actually, this depends on which campaign was cloned. What about having template campaigns (e.g. mc, run2, etc.) to clone from?

mt82 commented 5 months ago

I am trying to figure out which campaign parameters affect the disk location of the campaign.

mt82 commented 5 months ago

These are the active campaigns and their configurations:

#########################################################################################################################
# MC
#########################################################################################################################

# 2024A_ICARUS_simulation_genie_icarus_bnb_volDetEnclosure_MC_CV_Sys
# /exp/icarus/app/poms_test/cfg/icarus_test_withLArCVforMC_mateusc.cfg
# [g4step2_detsim_reco1_reco2_caf]
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/XXX/                                            #stage1, caf, flatcaf, calib, stage0_suppl, larcv

# 2024A_ICARUS_BNB_nue_volDetEnclosure_MC
# /exp/icarus/app/poms_test/cfg/icarus_test_withLArCVforMC_drielsma.cfg
# [stage_detsim_reco1_reco2_caf]
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/mc1/poms_%(prodstatus)s/%(campaign)s/%(version)s/stage1/
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/XXX/                                            #caf, flatcaf, calib, stage0_suppl

# NuMI_MC_reprocess_LArCV_wRUCIO_SLAC
# /exp/icarus/app/poms_test/cfg/icarus_run2_larcv_rucio_test_MC.cfg
# [larcv]
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/data/poms_%(prodstatus)s/dropbox/icaruscode_%(version)s/%(prodtype)s/%(sample)s/%(prodstatus)s/larcv/

#########################################################################################################################
# DATA
#########################################################################################################################

# 2024A_ICARUS_Run2_Reprocess_DATA_v09_89_01_bnbmajority
# /exp/icarus/app/home/icaruspro/cfg/icarus_run3_keepup_production.cfg
# [stage1_caf_larcv_stage1onDiskOFF]
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/production/%(experiment)s_Icaruspro_2024_Run2_Reprocessing_V2/%(version)s/offbeambnbmajority/XXX/ #stage1, larcv
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(streamname)s/XXX/                                        #calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled    

# 2024_Run3_Run11816_OpticalReconstruction_WG_offbeambnbminbias (+CloneCampaign_FT_test_DATA_2)
# /exp/icarus/app/home/icaruspro/cfg/icarus_run3_keepup_production_allstreams.cfg
# [stage1_caf_larcv]
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/poms_%(prodstatus)s/%(experiment)s_%(sample)s/%(version)s/alldatastreams/stage1/
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/alldatastreams/XXX/                                        #calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/poms_%(prodstatus)s/%(experiment)s_%(sample)s/%(version)s/alldatastreams/larcv/

#########################################################################################################################
# KEEPUP
#########################################################################################################################

# icarus_keepup_Physics_allstreams_Run3
# /exp/icarus/app/home/icaruspro/cfg/icarus_run3_keepup_production.cfg
# [stage1_caf_larcv_stage1onDisk]
/pnfs/sbn/data_add/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(sample)s/%(streamname)s/stage1/
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(streamname)s/XXX/                                        #calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled, larcv 

Grouping the destinations by path:

**/pnfs/sbn/data**
[MC    ] /pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/XXX/                                            #stage1, caf, flatcaf, calib, stage0_suppl, larcv
[MC    ] /pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/XXX/                                            #caf, flatcaf, calib, stage0_suppl
[KEEPUP] /pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(streamname)s/XXX/                                        #calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled, larcv 
[DATA  ] /pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(streamname)s/XXX/                                        #calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled   
[DATA  ] /pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/alldatastreams/XXX/                                        #calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled

**/pnfs/sbn/data_add**
[KEEPUP] /pnfs/sbn/data_add/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(sample)s/%(streamname)s/stage1/

**/pnfs/%(experiment)s/scratch**
[MC    ] /pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/mc1/poms_%(prodstatus)s/%(campaign)s/%(version)s/stage1/
[MC    ] /pnfs/%(experiment)s/scratch/users/%(experiment)spro/data/poms_%(prodstatus)s/dropbox/icaruscode_%(version)s/%(prodtype)s/%(sample)s/%(prodstatus)s/larcv/
[DATA  ] /pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/production/%(experiment)s_Icaruspro_2024_Run2_Reprocessing_V2/%(version)s/offbeambnbmajority/XXX/ #stage1, larcv
[DATA  ] /pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/poms_%(prodstatus)s/%(experiment)s_%(sample)s/%(version)s/alldatastreams/stage1/
[DATA  ] /pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/poms_%(prodstatus)s/%(experiment)s_%(sample)s/%(version)s/alldatastreams/larcv/
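The grouping above could be reproduced with a short script. This is only a sketch: the function names `storage_root` and `group_by_root` are hypothetical, and the three templates are copied from the listing as examples.

```python
from collections import defaultdict

# Storage roots taken from the three headings above.
ROOTS = [
    "/pnfs/sbn/data",
    "/pnfs/sbn/data_add",
    "/pnfs/%(experiment)s/scratch",
]

def storage_root(template):
    """Return the root that is a whole-path-component prefix of template."""
    for root in ROOTS:
        # Matching on root + "/" keeps data and data_add distinct.
        if template.startswith(root + "/"):
            return root
    return "other"

def group_by_root(templates):
    """Map each storage root to the list of templates that live under it."""
    groups = defaultdict(list)
    for template in templates:
        groups[storage_root(template)].append(template)
    return dict(groups)

templates = [
    "/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/XXX/",
    "/pnfs/sbn/data_add/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(sample)s/%(streamname)s/stage1/",
    "/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/mc1/poms_%(prodstatus)s/%(campaign)s/%(version)s/stage1/",
]

groups = group_by_root(templates)
for root, members in groups.items():
    print(root, len(members))
```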

Grouping by type/stage:

#########################################################################################################################
# MC
#########################################################################################################################

[caf, flatcaf, calib, stage0_suppl]
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/XXX/

[stage1]
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/stage1/
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/mc1/poms_%(prodstatus)s/%(campaign)s/%(version)s/stage1/

[larcv]
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(campaign)s/%(sample)s/%(filetype)s/reconstructed/icaruscode_%(version)s/larcv/
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/data/poms_%(prodstatus)s/dropbox/icaruscode_%(version)s/%(prodtype)s/%(sample)s/%(prodstatus)s/larcv/

#########################################################################################################################
# DATA
#########################################################################################################################

[calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled]
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(streamname)s/XXX/
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/alldatastreams/XXX/

[stage1]
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/poms_%(prodstatus)s/%(experiment)s_%(sample)s/%(version)s/alldatastreams/stage1/

[larcv]
/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/data/poms_%(prodstatus)s/%(experiment)s_%(sample)s/%(version)s/alldatastreams/larcv/

#########################################################################################################################
# KEEPUP
#########################################################################################################################

[calib, (flat)caf_blind, (flat)caf_unblind, (flat)caf_prescaled, larcv]
/pnfs/sbn/data/sbn_fd/poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/reconstructed/icaruscode_%(version)s/%(streamname)s/XXX/

mt82 commented 5 months ago

filetype: data mc

prodstatus: production

prodtype: decoder decoder_test keepup production SBN

sample: BNB_nue cosmicmu cosmicmuon decoder fast_opt_sim_new genie_bnb genie_overlay intrinsic_nue metadata muon_bnblike nominal_nu_bnb numioffaxis numu_bnb numu_corsika oscillated_nue purity single_electron_bnb single_electronpiplus single_muon_bnb single_photonpiplus single_pizero

version: v06_69_01 v06_79_00 v07_02_00 v07_11_00 v08_11_00 v08_12_00 v08_13_02 v08_19_01 v08_22_00 v08_29_00 v08_30_00 v08_34_00 v08_40_00 v08_41_00 v08_45_00 v08_48_00 v08_49_00 v08_56_00 v08_61_00 v09_09_00 v09_28_01_02 v6_81_00 v6_82_00 v_override_me

experiment: hypot icarus sbnd

streamname: only override_me
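The parameter names listed above are the ones that appear as `%(name)s` placeholders in the path templates. As a rough sketch, they could be extracted from any template with a regular expression; the helper name `placeholders` is hypothetical, and the template is one of the MC scratch paths quoted earlier in this issue.

```python
import re

# Matches %(name)s and captures the name.
PLACEHOLDER = re.compile(r"%\((\w+)\)s")

def placeholders(template):
    """Return the distinct parameter names in a template, in order of first use."""
    seen = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen

template = (
    "/pnfs/%(experiment)s/scratch/users/%(experiment)spro/dropbox/mc1/"
    "poms_%(prodstatus)s/%(campaign)s/%(version)s/stage1/"
)
print(placeholders(template))
# -> ['experiment', 'prodstatus', 'campaign', 'version']
```

Running this over every destination template in a config file would show which parameters actually affect the disk location.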

mt82 commented 5 months ago

Proposal:

1- define a prefix at storage area level: datapools, data2pools, scratch

disk_prefix = /pnfs/sbn/data/sbn_fd
disk2_prefix = /pnfs/sbn/data_add/sbn_fd
scratch_prefix = /pnfs/%(experiment)s/scratch/users/%(experiment)spro

2- define a prefix at campaign level

campaign_prefix = poms_%(prodstatus)s/%(filetype)s/%(campaign)s/%(version)s

3- The resulting final path for each stage will be a concatenation of the above ones:

%(disk_prefix)s/%(campaign_prefix)s/XXX/ 

or

%(disk2_prefix)s/%(campaign_prefix)s/XXX/ 

or

%(scratch_prefix)s/%(campaign_prefix)s/XXX/ 

where XXX = stage1, caf, larcv, ...
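The concatenation in step 3 can be checked with Python's `configparser`, whose default interpolation uses the same `%(name)s` syntax. This is a minimal sketch, not the production configuration: the parameter values are illustrative picks from names seen in this issue, the `dest` option name is hypothetical, and `default_section="global"` is used only so that other sections can see the `[global]` values in plain `configparser`.

```python
import configparser

CFG = """
[global]
experiment = icarus
prodstatus = production
filetype = mc
campaign = 2024A_ICARUS_BNB_nue_volDetEnclosure_MC
version = v09_89_01
disk_prefix = /pnfs/sbn/data/sbn_fd
disk2_prefix = /pnfs/sbn/data_add/sbn_fd
scratch_prefix = /pnfs/%(experiment)s/scratch/users/%(experiment)spro
campaign_prefix = poms_%(prodstatus)s/%(filetype)s/%(campaign)s/%(version)s

[stage1]
dest = %(scratch_prefix)s/%(campaign_prefix)s/stage1/

[caf]
dest = %(disk_prefix)s/%(campaign_prefix)s/caf/
"""

# Treat [global] as the default section so its values are visible everywhere.
parser = configparser.ConfigParser(default_section="global")
parser.read_string(CFG)

stage1_dest = parser["stage1"]["dest"]
caf_dest = parser["caf"]["dest"]
print(stage1_dest)
print(caf_dest)
```

With these values the stage1 destination resolves under `/pnfs/icarus/scratch/users/icaruspro/` and the caf destination under `/pnfs/sbn/data/sbn_fd/`, each followed by the same campaign prefix.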

mt82 commented 5 months ago

From @mattfcs: it seems reasonable to me. Quick comments:

For MC we technically run production in "pushes", which is what the umbrella names like MC2024A, MC2024B, etc. are for. That is not easy to see right now because we have been running production constantly these last months. But the idea is that two MC samples under MC2024B have the same version and the same configurations (besides what is specific to each sample). So all I'm saying is that we should be more specific about what the campaign tag is: it should include the MC202XZ as well as something that uniquely identifies the sample. Does this make more sense?

mt82 commented 5 months ago

The idea is to group together MC productions produced with the same production release, do I understand correctly? If so, we should define another variable (for MC) at a more general level than the single campaign, something like campaigns_flag.

mt82 commented 5 months ago

from @mattfcs

mt82 commented 5 months ago

The standard prefixes could be put in a separate config file, which would then be included in the actual config file of each campaign. The include directive should go in the [global] section.

[global]
includes = /path/to/my.cfg
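The effect of such an include can be emulated in plain Python to see how shared prefixes and campaign values would combine. This is only an illustration under stated assumptions: `configparser` does not itself act on the `includes` line (that is left to the launch tool), both file contents are hypothetical, and `stage1_dest` is a made-up option name.

```python
import configparser

# Hypothetical contents of the shared file (/path/to/my.cfg above).
SHARED_CFG = """
[global]
disk_prefix = /pnfs/sbn/data/sbn_fd
disk2_prefix = /pnfs/sbn/data_add/sbn_fd
scratch_prefix = /pnfs/%(experiment)s/scratch/users/%(experiment)spro
"""

# Hypothetical campaign file carrying the include line.
CAMPAIGN_CFG = """
[global]
includes = /path/to/my.cfg
experiment = icarus
stage1_dest = %(scratch_prefix)s/stage1/
"""

parser = configparser.ConfigParser()
parser.read_string(SHARED_CFG)    # shared prefixes first
parser.read_string(CAMPAIGN_CFG)  # campaign values layered on top

resolved = parser["global"]["stage1_dest"]
print(resolved)
# -> /pnfs/icarus/scratch/users/icaruspro/stage1/
```

Keeping the prefixes in one shared file means a change of storage area only has to be made once.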

mt82 commented 3 months ago

Proposal discussed with Mateus, Francisco and Matteo:

1- define a prefix at storage area level: datapools, data2pools, scratch

disk_prefix = /pnfs/sbn/data/sbn_fd
disk2_prefix = /pnfs/sbn/data_add/sbn_fd
scratch_prefix = /pnfs/%(experiment)s/scratch/users/%(experiment)spro

2- define a prefix for campaign (e.g. MC2024A) for MC or prodtype (e.g. Run2_Rep) for data:

campaign = MC202XY
prodtype = Run2_Rep

3- define a prefix at sample level

for MC:

sample_prefix = poms_%(prodstatus)s/%(filetype)s/%(campaign)s/%(sample)s/%(version)s

for data:

sample_prefix = poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/%(sample)s/%(version)s

4- The resulting final path for each stage will be a concatenation of the above ones:

%(disk_prefix)s/%(sample_prefix)s/XXX/

or

%(disk2_prefix)s/%(sample_prefix)s/XXX/

or

%(scratch_prefix)s/%(sample_prefix)s/XXX/

where XXX = stage1, caf, larcv, ...
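To make the revised scheme concrete, here is a small sketch that builds the final path for a given storage area and stage with Python's `%` formatting. It assumes all three storage areas are combined with `sample_prefix`; the function name `stage_path`, the dictionary keys `disk`/`disk2`/`scratch`, and the parameter values (`BNB_nue`, `offbeambnbmajority`, `v09_89_01`) are illustrative choices, not an agreed convention.

```python
STORAGE_PREFIXES = {
    "disk": "/pnfs/sbn/data/sbn_fd",
    "disk2": "/pnfs/sbn/data_add/sbn_fd",
    "scratch": "/pnfs/%(experiment)s/scratch/users/%(experiment)spro",
}

# MC paths are keyed on campaign, data paths on prodtype (step 3 above).
SAMPLE_PREFIXES = {
    "mc": "poms_%(prodstatus)s/%(filetype)s/%(campaign)s/%(sample)s/%(version)s",
    "data": "poms_%(prodstatus)s/%(filetype)s/%(prodtype)s/%(sample)s/%(version)s",
}

def stage_path(area, stage, params):
    """Concatenate storage prefix, sample prefix and stage, then fill in params."""
    parts = [STORAGE_PREFIXES[area], SAMPLE_PREFIXES[params["filetype"]], stage]
    template = "/".join(parts) + "/"
    return template % params

mc_params = {
    "experiment": "icarus", "prodstatus": "production", "filetype": "mc",
    "campaign": "MC2024A", "sample": "BNB_nue", "version": "v09_89_01",
}
data_params = {
    "experiment": "icarus", "prodstatus": "production", "filetype": "data",
    "prodtype": "Run2_Rep", "sample": "offbeambnbmajority", "version": "v09_89_01",
}

print(stage_path("disk", "caf", mc_params))
# -> /pnfs/sbn/data/sbn_fd/poms_production/mc/MC2024A/BNB_nue/v09_89_01/caf/
print(stage_path("scratch", "stage1", data_params))
# -> /pnfs/icarus/scratch/users/icaruspro/poms_production/data/Run2_Rep/offbeambnbmajority/v09_89_01/stage1/
```

A missing parameter raises `KeyError`, which gives an early check that a campaign defines everything its path needs.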

mt82 commented 3 months ago

Next step:

mt82 commented 3 months ago

@FranciscoTapia61199

Here is a more detailed explanation of what to do for the test implementation of the new folder/naming organization we are proposing: