dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0

Fix base lfn in stdspecs #757

Closed sfoulkes closed 12 years ago

sfoulkes commented 14 years ago

It's currently some weird combination of the T0 and PA conventions.

sfoulkes commented 14 years ago

sfoulkes: Dave, review please? Specifically that I have the convention correct.

drsm79 commented 14 years ago

metson: The convention isn't quite right:

https://twiki.cern.ch/twiki/bin/viewauth/CMS/DMWMPG_Namespace#Top_level_namespace

There should be an lfn counter (to deal with large blocks) and a GUID for a file name at the end (though maybe you're making a base path?), e.g.

/store/data/09_Spring/single_electron/AOD/v2/0132/somerandomhexnumber.root

hufnagel commented 14 years ago

hufnagel: The Tier0 follows that convention, except our lfn counter is run based: <r/u/n>, with every element padded to 3 characters. For MC this is likely not very useful though, because we have one run with potentially very many files (this will change somewhat with run-based MC though).

For data reprocessing it remains a viable option, although I am not sure whether it's feasible/worthwhile to implement something like this or to just say that the Tier0 does its own thing and anything coming later just uses lfn counters.

In any case, I am pretty certain a single 4 digit number will not be sufficient.

Also, having the run number encoded in the LFN is convenient and can even be useful on the technical level (at least for the Tier0, where you can in theory play games with the TFC and do per-run access paths; this would have allowed us to recover the runs broken by the latest xrootd bug, for instance).
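For reference, a minimal sketch of the run-based counter described above, assuming the run number is zero-padded to nine digits and split into three 3-character elements (consistent with the `000/152/625` element of the PromptReco path quoted later in this thread). The function name `run_counter` and its parameters are hypothetical, not the Tier0 code.

```python
def run_counter(run_number, width=3, parts=3):
    """Turn a run number into a run-based LFN counter such as 000/152/625.

    Hypothetical helper: width and parts mirror the "<r/u/n>, every element
    padded to 3 characters" description; they are assumptions, not Tier0 API.
    """
    if run_number < 0:
        raise ValueError("run number must be non-negative")
    total = width * parts
    padded = str(run_number).zfill(total)
    if len(padded) > total:
        raise ValueError("run number does not fit into %d digits" % total)
    # Slice the padded string into fixed-width path elements.
    chunks = [padded[i:i + width] for i in range(0, total, width)]
    return "/".join(chunks)


print(run_counter(152625))  # 000/152/625
```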

sfoulkes commented 14 years ago

sfoulkes: This is just making the base path. The LFN counter and GUID file name are added at runtime during stage out.
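To make the split concrete, here is a rough sketch of how a full LFN could be assembled from a base path at stage out. It assumes a zero-padded numeric counter (four digits, as in the `0132` example above, although that width may not be sufficient) and an upper-case GUID as the file name. The function `full_lfn` is hypothetical and is not the WMCore stage-out code.

```python
import uuid


def full_lfn(base_path, counter, counter_width=4):
    """Append an lfn counter and a GUID file name to a base path.

    Hypothetical illustration only: counter_width=4 matches the 0132
    example in this thread, not a confirmed WMCore setting.
    """
    guid = str(uuid.uuid4()).upper()
    counter_dir = str(counter).zfill(counter_width)
    return "%s/%s/%s.root" % (base_path.rstrip("/"), counter_dir, guid)


# The GUID is random, so the file name differs on every call.
print(full_lfn("/store/data/09_Spring/single_electron/AOD/v2", 132))
```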

hufnagel commented 14 years ago

hufnagel: Btw, on second look, the twiki is slightly wrong (at least for the Tier0). The twiki lists

/store/data/acquisition_era/primary-dataset/data_tier/processing_version/lfn_counter/filename.root

but it really is

/store/data/acquisition_era/primary-dataset/data_tier/[processing_string-]processing_version/lfn_counter/filename.root

Only if processingString is None does it collapse to the syntax mentioned on the twiki.

For instance, PromptReco files would go to

/store/hidata/HIRun2010/HIAllPhysics/RECO/PromptReco-v3/000/152/625/E2B84CEA-74FA-DF11-B7CF-0030487CD17C.root
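A minimal sketch of that base path rule, assuming the processing string is simply joined to the processing version with a hyphen and dropped when it is None. The function name `base_lfn` and its argument names are hypothetical, not the actual spec code.

```python
def base_lfn(top, acquisition_era, primary_dataset, data_tier,
             processing_version, processing_string=None):
    """Build the LFN base path, without lfn counter or file name.

    Hypothetical helper following the
    [processing_string-]processing_version layout described above.
    """
    if processing_string:
        version = "%s-%s" % (processing_string, processing_version)
    else:
        # Collapses to the plain processing_version form listed on the twiki.
        version = processing_version
    return "/".join([top.rstrip("/"), acquisition_era, primary_dataset,
                     data_tier, version])


print(base_lfn("/store/hidata", "HIRun2010", "HIAllPhysics", "RECO",
               "v3", processing_string="PromptReco"))
# /store/hidata/HIRun2010/HIAllPhysics/RECO/PromptReco-v3
```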

This was recently changed by request of DataOps. I think originally we thought this would all be encoded in the processingVersion, but it wasn't used this way and we ended up with LFN space collisions between different samples (PromptReco and ReReco).

drsm79 commented 14 years ago

metson: Replying to [comment:5 hufnagel]:

This was recently changed by request of DataOps. I think originally we thought this would all be encoded in the processingVersion, but it wasn't used this way and we ended up with LFN space collisions between different samples (PromptReco and ReReco).

Why wasn't it done as agreed?

hufnagel commented 14 years ago

hufnagel: PromptReco and ReReco we could have kept separate this way. The real problem came in when we looked at skims though. To keep separate LFN namespaces for a full Reco pass and the corresponding RECO skims (or full AOD and skim AOD samples respectively), we would have had to assign different processingVersions to the full processing passes and to each skim run afterwards. Mix in multiple full processing passes and out-of-order skims and you end up with a complete mess.
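A small illustration of the collision, as a sketch only: the era, dataset, and processing string values below are made-up placeholders, and `lfn_dir` is a hypothetical helper, not WMCore code.

```python
def lfn_dir(era, dataset, tier, version, processing_string=None):
    """Directory part of an LFN; placeholder logic for illustration."""
    last = f"{processing_string}-{version}" if processing_string else version
    return f"/store/data/{era}/{dataset}/{tier}/{last}"


# With only processingVersion in the path, a full RECO pass and a RECO skim
# of the same era/dataset/version land in the same directory.
full_pass = lfn_dir("SomeEra", "SomeDataset", "RECO", "v1")
skim = lfn_dir("SomeEra", "SomeDataset", "RECO", "v1")
print(full_pass == skim)  # True: same LFN space

# Adding a (placeholder) processing string keeps the two samples apart
# without inventing a new processingVersion for every skim.
full_pass = lfn_dir("SomeEra", "SomeDataset", "RECO", "v1", "FullPass")
skim = lfn_dir("SomeEra", "SomeDataset", "RECO", "v1", "SomeSkim")
print(full_pass == skim)  # False: separate directories
```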

evansde77 commented 13 years ago

evansde: Before I review/commit this, are there any actual changes coming out of the above discussions?

hufnagel commented 13 years ago

hufnagel: I think we should change the conventions to use [processingString-]processingVersion instead of just processingVersion. Then all elements that distinguish a sample (and which are included in the DBS datasetpath) will also be present in the LFN. Yes, I know, LFN is not a namespace and this might encourage users to use it like this. OTOH, there are valid practical operational reasons to keep different samples in different directories.

sfoulkes commented 13 years ago

sfoulkes: Can we resolve this today? I'd like to have correct lfns in the next patch...I have no problems with Dirk's convention.

evansde77 commented 13 years ago

evansde: OK, I'll review/commit shortly...

sfoulkes commented 13 years ago

sfoulkes: If we go with Dirk's convention I'll have to tweak the patch.

evansde77 commented 13 years ago

evansde: (In dc0e5067aa210f22a01f76da8576a68fecd9f4bb) From d50bc86572ad8ea5215b3fdfdbcca61aae4bce47 Mon Sep 17 00:00:00 2001 Subject: [PATCH] Fix lfn naming. Fixes #757.

From: Steve Foulkes sfoulkes@fnal.gov

evansde77 commented 13 years ago

evansde: Replying to [comment:12 sfoulkes]:

If we go with Dirk's convention I'll have to tweak the patch.

GDMF!!!!

sfoulkes commented 13 years ago

sfoulkes: Patch attached that uses the Hufnagel convention. Dave, review?

evansde77 commented 13 years ago

evansde: (In ec37a7fa30f9e9dc4ee211016911c4a4090db7ec) From 7b5ae6955563041cca6e0516b23c7ccdf7905436 Mon Sep 17 00:00:00 2001 Subject: [PATCH] Include the processing string in the LFN base. Fixes #757.

From: Steve Foulkes sfoulkes@fnal.gov

Signed-off-by: evansde77 evansde77@gmail.com

drsm79 commented 13 years ago

metson: "The Hufnagel convention" sounds like a film from the 70's

evansde77 commented 13 years ago

evansde: I thought the Hufnagel Convention was like the Geneva Convention but with a better Buffet...

sfoulkes commented 13 years ago

sfoulkes: (In 11040) From d50bc86572ad8ea5215b3fdfdbcca61aae4bce47 Mon Sep 17 00:00:00 2001 Subject: [PATCH] Fix lfn naming. Fixes #757.

From: Steve Foulkes sfoulkes@fnal.gov

sfoulkes commented 13 years ago

sfoulkes: (In 11041) From 7b5ae6955563041cca6e0516b23c7ccdf7905436 Mon Sep 17 00:00:00 2001 Subject: [PATCH] Include the processing string in the LFN base. Fixes #757.

From: Steve Foulkes sfoulkes@fnal.gov

Signed-off-by: evansde77 evansde77@gmail.com