dmwm / CMSRucio


Meta data for tape family #323

Open drkovalskyi opened 1 year ago

drkovalskyi commented 1 year ago

For optimal tape storage performance, data should be grouped into tape families, each holding files that are likely to be staged out at the same time. In most cases storage admins look at the directory structure of CMS logical file names and guess what the optimal grouping of data into tape families should be. We can provide additional information in the form of metadata to improve these decisions. We need to find a way to store this information and make it available to the sites in a convenient way. It may require passing information via FTS or via file transfer protocols.

Examples of what meta data to pass around:

Exact usage of information will need to be discussed and coordinated between storage admins, computing operations and PPD.

aperezca commented 1 year ago

Hi Dima,

At PIC we typically look at the type of data (mc or data in our case, rarely any HI), primary dataset and then data tier. We extract that information from the LFN, so the process would be made easier for us if, for every pending rucio rule, we immediately got the list of "base LFN structure" strings for each of the datasets in the rule (e.g. /store/data/Run2022B/Tau/RAW/ or /store/mc/Phase2HLTTDRWinter20GS/MinBias_TuneCP5_14TeV-pythia8_pilot/GEN-SIM/).

As it is now, we need to obtain the pending rules, get their corresponding containers, then datasets, full file LFN dump, then get common LFN structure. Then, we can assign proper storage tags to each destination PFN (e.g. tape family and tape family width, i.e. number of tapes to write in parallel). We could save some cycles if that information ("base LFN structure") was already provided to us with the pending rule ID.

(of course, I'm not a Rucio expert, so there are probably more straightforward ways to obtain the data we need than what I described)
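The last step described above (going from a full file LFN dump to the "base LFN structure") can be sketched in plain Python. This is only an illustration: the function names, the `depth=5` cut-off and the sample file names are assumptions made here, not part of any Rucio or site tooling.

```python
import posixpath


def base_lfn(lfns, depth=5):
    """Return the common leading directory of a list of LFNs.

    The result is truncated to `depth` path components
    (store/<type>/<era>/<primary dataset>/<tier> by default)
    and returned with a trailing slash.
    """
    if not lfns:
        raise ValueError("need at least one LFN")
    # Work on directories so a single-file list does not return the file itself.
    dirs = [posixpath.dirname(lfn) for lfn in lfns]
    common = posixpath.commonpath(dirs)
    parts = common.strip("/").split("/")[:depth]
    return "/" + "/".join(parts) + "/"


def split_base_lfn(base):
    """Split a base LFN into the fields a site might use for tape tagging."""
    parts = base.strip("/").split("/")
    if len(parts) < 5 or parts[0] != "store":
        raise ValueError(f"unexpected LFN layout: {base}")
    return {
        "data_type": parts[1],        # e.g. "data" or "mc"
        "era": parts[2],              # e.g. "Run2022B"
        "primary_dataset": parts[3],  # e.g. "Tau"
        "data_tier": parts[4],        # e.g. "RAW"
    }


# Hypothetical file names, used only to exercise the functions.
sample_lfns = [
    "/store/data/Run2022B/Tau/RAW/v1/000/355/100/00000/a.root",
    "/store/data/Run2022B/Tau/RAW/v1/000/355/101/00000/b.root",
]
base = base_lfn(sample_lfns)
fields = split_base_lfn(base)
```

If the rule itself carried this string, a site would only need the second function.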

DAMason commented 1 year ago

At FNAL we've moved to grouping MC by year, and data by run era and by whether the data tier is raw or reconstructed. The biggest uncertainty for us is with data, where we don't usually have a sense of the expected size of the datasets (or rate, i.e. TB/week, etc.). The metadata listed above are actually already encoded in the dataset names (year, data tier), and whether it is HI or pp is in the LFN.

ericvaandering commented 1 year ago

@beer4duke

beer4duke commented 1 year ago

At CERN, we started grouping MC into a dedicated tape family and data by year starting with 2022 (previous data was grouped by run). We initially considered grouping by era, but this would have created too many tape families and allocated too many writable tapes. Indeed, each tape family needs a rather large number of tapes to reach the nominal tape write throughput we must provide to experiments. Operationally we are already starting to see some benefits from this logical separation: we can better track MC rates and file properties, and separate older data cleanup and the corresponding repack write flow from the current 2022 data taking. For now tape families are statically configured at the directory level (just as was done in CASTOR): more metadata would give us more hints to understand how data is logically recalled, understand how badly experiment files are spread over our current tapes, and potentially refine data placement under heavy bandwidth constraints.
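As a rough illustration of what a static, directory-level configuration amounts to, the sketch below maps an LFN to a tape family by longest matching path prefix. The prefixes and family names are hypothetical and do not reflect the actual CERN configuration.

```python
# Hypothetical directory-prefix -> tape family table (not a real site config).
TAPE_FAMILIES = {
    "/store/mc/": "cms_mc",
    "/store/data/": "cms_data_default",
    "/store/data/Run2022": "cms_data_2022",
}


def tape_family_for(lfn, families=TAPE_FAMILIES):
    """Return the family of the longest configured prefix matching `lfn`, or None."""
    matches = [prefix for prefix in families if lfn.startswith(prefix)]
    if not matches:
        return None
    return families[max(matches, key=len)]
```

With only the path to go on, anything not visible in the directory name (expected size, recall pattern) cannot influence the choice, which is the gap extra metadata would fill.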

ericvaandering commented 1 year ago

Rules can have metadata attached to them at creation, so this is something which WMAgent could do. Still figuring out the exact form. Attaching the right metadata would help the PIC and FNAL situations, I guess.

How to translate that through FTS and the layers further down, if needed, I'm not sure.

Also it appears to me that rules made by subscription (like we do for miniAOD) do not have this ability.

jhonatanamado commented 1 year ago

The Tier0 agent can use the meta key to give that information to the site admins. Tier0 creates the rules to Tape with the following arguments:

{'ask_approval': False,  # or True, depending on the config of the RSE
'activity': 'T0 Tape', 
'account': 'tier0_prod', 
'grouping': 'ALL', 
'comment': 'T0 WMAgent automatic container rule', 
'priority': 4, 
'meta': '{"agentHost": "vocms015.cern.ch", "userAgent": "WMAgent"}'
}

So I believe we need to define which other values need to be added. So far those values come from meta and the basic rule args, and the replication rule is finally created here

beer4duke commented 1 year ago

The real issue with archive metadata is that it has to be somehow translated at the protocol level to be passed along the data transfer stream(s): archiving is not a pure metadata operation like staging, but a transfer that carries little metadata.

jhonatanamado commented 1 year ago

Hi guys. Coming back to this issue: what if the container rule had the following metadata?

'meta': '{
           "agentHost": "agenthost", 
           "userAgent": "WMAgent",
           "lfnBase": "/store/data/acquisition_era/primary-dataset/data_tier/processing_version",
           "size": "growing"
           }'

lfnBase will follow the conventions of the LFN namespace. Once this information is available at the container-level rule, it could be possible to pass lfnBase as additional information to the FTS jobs.
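A minimal sketch of how an agent could assemble that `meta` value as a JSON string, matching the string form used in the rule arguments earlier in this thread. The helper name and its parameters are hypothetical, and the sample values are only examples; whether Rucio or FTS would propagate the extra keys is not established here.

```python
import json


def build_rule_meta(agent_host, acquisition_era, primary_dataset,
                    data_tier, processing_version,
                    data_type="data", size="growing"):
    """Build the JSON string for the 'meta' rule argument (hypothetical helper)."""
    # lfnBase follows /store/<type>/<era>/<primary dataset>/<tier>/<version>
    lfn_base = "/".join(
        ["", "store", data_type, acquisition_era,
         primary_dataset, data_tier, processing_version]
    )
    meta = {
        "agentHost": agent_host,
        "userAgent": "WMAgent",
        "lfnBase": lfn_base,
        "size": size,
    }
    return json.dumps(meta)


# Example values only; the host name is a placeholder.
meta_str = build_rule_meta("agenthost", "Run2022B", "Tau", "RAW", "v1")
```

Serializing with `json.dumps` avoids hand-written quoting mistakes in the string.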

ericvaandering commented 1 year ago

Relevant issue in Rucio is: https://github.com/rucio/rucio/issues/2706

ericvaandering commented 10 months ago

That issue has been closed with no meaningful action, as far as I can tell.

ericvaandering commented 8 months ago

In discussions with Martin, we're going to get Maggie working on this as a first major project, including the core Rucio portion.

dynamic-entropy commented 8 months ago

I got a different vibe from Dimitrios, who gave the impression that until the tapes show an improvement, they will not put any effort into this. They have the following patch (buggy for non-deterministic RSEs), https://gitlab.cern.ch/atlas-adc-ddm/flux-rucio/-/blob/master/releases/production/common-includes/tape_metadata.patch?ref_type=heads, which is enabled at KIT. By the way, who is Maggie?

ericvaandering commented 7 months ago

Supplanted by https://github.com/rucio/rucio/issues/6398