dmwm / CMSRucio

Determine which information CMS should send to CTA/tape systems for grouping decisions #753

Closed ericvaandering closed 1 month ago

ericvaandering commented 6 months ago

This first issue is a requirements issue and certainly needs a discussion. There are some ideas in #323, but now we also have a technical constraint: we are supposed to supply 4 levels of information to be used to make grouping decisions.

https://indico.cern.ch/event/1353243/contributions/5812010/ is worth taking a look at too.

ericvaandering commented 6 months ago

Seems to me level 0 is likely data+era and MC+year. Level 1 may be data tier.

@DAMason @drkovalskyi

drkovalskyi commented 6 months ago

As far as I know CTA at CERN uses special tape families for data only, grouping everything by year. FNAL gets less data, so I doubt that you need era information, plus some eras are very small. For MC it's very hard to say something sensible. The data tier is probably the most relevant input. For example: AODSIM is mostly not needed, but when we need it, it's likely needed in bulk.

ericvaandering commented 6 months ago

Let's not worry about what particular sites do. I assume Fermilab will still continue to create file families/storage classes in CTA and probably make the first level or two of data redundant with that. That's fine. The idea is that eventually CTA will use this information, especially the finer bits of it, to decide not just which tapes some data is written to but also how things are laid out on the tape. Writing data so that, even if there are 10 datasets on a tape, each of the 10 is grouped together would be a large benefit for reading it back.

So we need to figure out our 4 levels even if the top ones are not useful to sites that use file families (storage classes is the CTA term).
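
To illustrate why the finer levels matter for layout, here is a minimal sketch. It is not CTA's actual algorithm, and the file names and level values are hypothetical placeholders rather than an agreed CMS scheme. It only shows that if every file carries a tuple of up to four grouping levels, ordering the write queue by that tuple places files that share all levels next to each other.

```python
from itertools import groupby

# Hypothetical files queued for one tape, each tagged with four grouping levels.
# The level values are placeholders, not an agreed CMS scheme.
queued = [
    ("file_a.root", ("data", "2024", "RAW", "dataset_1")),
    ("file_b.root", ("data", "2024", "RAW", "dataset_2")),
    ("file_c.root", ("data", "2024", "RAW", "dataset_1")),
    ("file_d.root", ("data", "2024", "AOD", "dataset_3")),
    ("file_e.root", ("data", "2024", "RAW", "dataset_2")),
]


def layout_order(files):
    """Order files so that those sharing the same levels end up adjacent."""
    # Tuples compare level by level, so coarse levels dominate fine ones.
    return sorted(files, key=lambda item: item[1])


ordered = layout_order(queued)

# Walk the ordered queue one group of identical levels at a time.
for levels, group in groupby(ordered, key=lambda item: item[1]):
    print(levels, [name for name, _ in group])
```

Files that arrived interleaved (`file_a`, `file_b`, `file_c`, ...) come out grouped per dataset, which is the property that would help a bulk recall.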

drkovalskyi commented 6 months ago

In this case, I would make level zero: Data or MC. Then I would focus on the data tier for all types of data (level one). Any mass recall for reprocessing requires a specific data tier. Then, as level 3, for MC the most important part is the production campaign and for Data its era, which may also include re-reco "campaign", but it's not systematic enough to rely on. I don't see anything worth considering for level 4.
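
As a rough sketch of the ordering described above, and not the agreed scheme or the actual plugin, the levels could be assembled as follows. The function name and arguments are hypothetical, and it assumes the caller already knows whether the dataset is MC, its data tier, and optionally its campaign or era.

```python
def proposed_levels(is_mc, data_tier, campaign_or_era=None):
    """Return grouping levels, coarsest first, in the order proposed above.

    Assumption: the inputs have already been extracted from the dataset
    metadata; this function only decides the ordering of the levels.
    """
    levels = ["MC" if is_mc else "Data", data_tier]
    # The campaign (MC) or era (Data) is only added when it is known.
    if campaign_or_era:
        levels.append(campaign_or_era)
    return levels


# Placeholder values, for illustration only.
print(proposed_levels(True, "AODSIM", "SomeCampaign"))
print(proposed_levels(False, "RAW"))
```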

voetberg commented 6 months ago

We can include the campaign as level 4 when it's present, and otherwise not include a level 4. Because it's the lowest level, it wouldn't interfere with the other levels, but it would be beneficial when it is used.

ericvaandering commented 6 months ago

> In this case, I would make level zero: Data or MC. Then I would focus on the data tier for all types of data (level one). Any mass recall for reprocessing requires a specific data tier. Then, as level 3, for MC the most important part is the production campaign and for Data its era, which may also include re-reco "campaign", but it's not systematic enough to rely on. I don't see anything worth considering for level 4.

I think you are missing the idea of file positions on tape, which is where the lower levels would really come in and could potentially make data access much faster and easier on the physical tape media. Take a look at https://indico.cern.ch/event/1353243/contributions/5812010/, especially slides 6 and 13.

But I think it's pretty clear we need to have a chat and/or an O&C presentation before finalizing this. That doesn't mean the development can't start because we need to have a way to pull this information out of the DIDs before it can be put into these fields.
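
A minimal sketch of pulling this information out of a DID name, under stated assumptions: the name follows the three-part `/primary/processed/tier` dataset convention, the processed part starts with the acquisition era or campaign followed by `-`, and real-data eras look like `Run2024A`. The function name, the regular expression, and the example names are hypothetical and not taken from the plugin.

```python
import re

# Assumption: acquisition eras of real data look like "Run2024A";
# MC campaign names are assumed not to match this pattern.
DATA_ERA = re.compile(r"^Run\d{4}[A-Z]$")


def parse_dataset_name(name):
    """Split a dataset-style DID name into the pieces useful for grouping."""
    parts = name.strip("/").split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected dataset name: {name!r}")
    primary, processed, data_tier = parts
    # Assumption: the era or campaign is everything before the first "-".
    era_or_campaign = processed.split("-", 1)[0]
    return {
        "primary_dataset": primary,
        "era_or_campaign": era_or_campaign,
        "data_tier": data_tier,
        "is_data": bool(DATA_ERA.match(era_or_campaign)),
    }


# Placeholder names, for illustration only.
print(parse_dataset_name("/SomePrimary/Run2024A-PromptReco-v1/AOD"))
print(parse_dataset_name("/SomeProcess/SomeCampaign-someconditions-v1/AODSIM"))
```

Whatever the final choice of levels is, they could then be filled from the returned fields; names that do not fit the convention raise `ValueError` rather than being guessed at.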

drkovalskyi commented 6 months ago

I have looked again at Julien's slide and I'm not sure what I'm missing. We can use the Tue time slot next week to have a chat if you want to.

ericvaandering commented 6 months ago

That would be great. I think. Assuming I am back from the eclipse. Do you want to make an indico dedicated to this?

drkovalskyi commented 6 months ago

We can also find another time slot. Let's try to do it before CMS week. Wishing you good weather. The eclipse is an exceptional event.

voetberg commented 5 months ago

Hi, was there ever any movement on this? I have the basic outline of the plugin done, but I'm not sure where this discussion ended up.

ericvaandering commented 5 months ago

Not yet. We should have a discussion, though.

klannon commented 1 month ago

@ericvaandering Issue is marked as done in board but not closed. Can you clarify? Is the information that was determined documented somewhere? (Last comment was from April 29 saying discussion is needed.)

ericvaandering commented 1 month ago

@klannon thanks for catching this. We had the discussion and made the decision, so now it's implementation and @voetberg says it's already pretty much done.