dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0

Changes needed to WMCore code to allow Rucio to manage DQM Harvested data #9677

Open fioriNTU opened 4 years ago

fioriNTU commented 4 years ago

Impact of the new feature
I will let more "technical" people add information here. This new feature should make it possible to manage DQM Harvested files with the Rucio system. Examples of such files are available here: https://cmsweb.cern.ch/dqm/offline/data/browse/ROOT/OfflineData/Run2018/ZeroBias/0003239xx/ The files are currently in local VM storage (see the next section), while we would like to have them available in Rucio for easier access.

Is your feature request related to a problem? Please describe.
Currently, Harvested DQM files (i.e. files used by the DQM GUI) are handled in a way that often makes it difficult to access them a few months after they have been created. Having those files managed by Rucio would make it much easier to store and share them.

Describe the solution you'd like
Harvested DQM files are currently saved directly on the DQM GUI servers; for example, on vocms0738 a fraction of the DQM files is kept temporarily after harvesting in the folder /data/srv-prod/state/dqmgui/offline/data/OfflineData/. Later the files are copied to Castor tape, making it quite difficult to access them again. We would like a different way to manage these files, and after talking with Computing it looks like Rucio is the right solution, but it requires some changes to the current WMCore code to go ahead with the project.

Describe alternatives you've considered
For the time being, no alternatives have been considered.

Additional context
This feature request is here as a reminder for WMCore developers about the need for some changes in the base code to allow the migration of DQM Harvested files to Rucio storage. If more details are needed, please contact cms-dqm-coreTeam@cern.ch

amaltaro commented 4 years ago

Hi @fioriNTU, thanks for creating this issue. We are currently dealing with some issues in production; as soon as those are under control, we will get back to this issue and start discussing this new feature. Perhaps you could also link the PDF presentation from when this was proposed within CMS? Thanks

schneiml commented 4 years ago

Hi Alan,

our attention was brought to Rucio by this talk [1]. We are now working towards implementing the "Compromise solution" from page 11: managing the legacy ROOT files with Rucio.

[1] https://indico.cern.ch/event/908846/#6-a-new-datatier-to-save-dqmio

Thanks,

Marcel

fioriNTU commented 4 years ago

Hi @amaltaro , any news on this?

amaltaro commented 4 years ago

Hi Marcel, Francesco, sorry for the belated reply. I think it's important to first provide you with a summary of the DQM harvesting functionality implemented in WMCore and how it works.

When the harvesting step is enabled in a standard workflow, we need to make sure that all the DQM/DQMIO files are available in the same storage, to ensure that a single job per run is created, so that we avoid multiple versions of the same run (with incomplete statistics) being processed and uploaded to the DQMGui.

That being said, how would we handle possible multiple versions of the same ROOT file (for the same run(s)) in this new dataset/datatier in the Data Management bookkeeping?

Another question I have: would we want to keep uploading these ROOT files to the DQMGui (in addition to the new implementation, which would stage those files out to the site storage)?

I also wonder whether we would have all the provenance and metadata information to be injected into the DM system (parentage, runs, lumis, events, etc.)? I'm not sure how important all of that is at the moment, though (we do not have event counts for DQMIO, for instance). Thanks

schneiml commented 4 years ago

Hi Alan,

Regarding the modes of harvesting: I think this should not really affect this proposal (or does it?). In the long term, it would be nice to get rid of all of this "magic" logic in WMCore and have harvesting jobs as normal "dataset in, dataset out" jobs, but that is not the goal here for now.

Re the "at least": I assume this is because data can be harvested multiple times, as more jobs finish? In the end, there should be exactly one final job that processes all the data, and its result supersedes all the earlier ones.

Re the versioning: Ideally we would replicate exactly the logic that we have today in the DQMGUI on the Computing side: when a new file with the same dataset/run appears, its version number in the file name is increased to one higher than that of the latest existing file, and it is added to the storage. I had assumed that with Rucio we could keep the file names as they are today, which would make migration easy; but anything with similar semantics (append new files, provide a way to determine which file is newest) is fine. What is important is that we never replace existing files (the contents of a file with a given name shall never change). We might delete older versions, though.
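To pin down the append-only semantics, here is a minimal sketch; it is ours, not existing DQMGUI or WMCore code, and the filename pattern is inferred from the example files linked in the first comment.

```python
import re

# Filenames as served by the DQMGUI, e.g.
# DQM_V0001_R000316216__ZeroBias__Run2018A-PromptReco-v1__DQMIO.root
VERSION_RE = re.compile(r"^DQM_V(\d{4})_R(\d{9})__(.+)\.root$")

def next_version_name(existing_names, run, rest):
    """Return a filename whose version is one higher than the latest
    existing file for the same run and dataset part; existing files
    are never rewritten, new versions are only appended."""
    versions = []
    for name in existing_names:
        m = VERSION_RE.match(name)
        if m and int(m.group(2)) == run and m.group(3) == rest:
            versions.append(int(m.group(1)))
    new_version = max(versions, default=0) + 1
    return "DQM_V%04d_R%09d__%s.root" % (new_version, run, rest)

existing = ["DQM_V0001_R000316216__ZeroBias__Run2018A-PromptReco-v1__DQMIO.root"]
print(next_version_name(existing, 316216,
                        "ZeroBias__Run2018A-PromptReco-v1__DQMIO"))
# -> DQM_V0002_R000316216__ZeroBias__Run2018A-PromptReco-v1__DQMIO.root
```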

Re the GUI uploads: For now, we'd like to have both the uploads and the new dataset at the same time, to allow for some validation and not require a hard switch-over. Once all the dataset machinery works and we have the new GUI running well (this might well take a year), the uploads can be stopped. Instead, we might want to receive an HTTP callback announcing new files, to avoid having to poll for new data constantly; but that can be on a best-effort basis and fall back to polling once per day.
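As a rough illustration of the best-effort fallback (the listing URL and JSON format below are purely hypothetical, and as agreed later in the thread this mechanism is deferred to a future issue):

```python
import json
import urllib.request

def poll_new_files(listing_url, known_names):
    """Best-effort daily poll: fetch a (hypothetical) JSON list of
    harvested file names and return the ones not seen before."""
    with urllib.request.urlopen(listing_url) as resp:
        current = set(json.load(resp))
    return sorted(current - known_names)

# A push-style HTTP callback would POST the same names as soon as they
# are registered; this poller, run once per day, is the fallback.
```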

Re metadata: I don't really know much about that. It would certainly be nice to have, but all the data we need (and have today) is encoded in the names of the uploaded files: the (input/DQMIO) dataset name and run number. Having more metadata is always nice, but we don't absolutely need it for now.

amaltaro commented 4 years ago

Re the "at least": I assume this is because data can be harvested multiple times, as more jobs finish? In the end, there should be exactly one final job that processes all the data, and its result supersedes all the earlier ones.

Only the T0 does periodic harvesting. Central production only harvests files once all the input data has been made available from the previous tasks/steps.

About the number of jobs: we cannot guarantee that there will be a single job to process the whole dataset (or a given run of it). The reason is that harvesting jobs are data driven by design; like any other job that requires input data, they go where the data is available, and if the previous steps/tasks actually ran at different sites, then the system creates multiple jobs, one for each location. I know that this has undesirable behaviour, but I'd rather discuss it in a different place/issue.
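To illustrate why this happens (a toy sketch of ours, not WMCore's actual job splitter): harvesting jobs are grouped by where the input files sit, so one run whose inputs landed at two sites yields two jobs.

```python
from collections import defaultdict

def harvesting_jobs(files):
    """Toy grouping: one harvesting job per (run, site) holding input
    DQMIO files, mirroring the data-driven behaviour described above."""
    jobs = defaultdict(list)
    for lfn, run, site in files:
        jobs[(run, site)].append(lfn)
    return jobs

files = [
    ("/store/unmerged/a.root", 316216, "T1_US_FNAL"),
    ("/store/unmerged/b.root", 316216, "T2_CH_CERN"),  # same run, other site
]
print(len(harvesting_jobs(files)))  # -> 2 jobs for a single run
```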

Re the versioning: Ideally we would replicate exactly the logic that we have today in the DQMGUI on the Computing side: when a new file with the same dataset/run appears, its version number in the file name is increased to one higher than that of the latest existing file, and it is added to the storage. I had assumed that with Rucio we could keep the file names as they are today, which would make migration easy; but anything with similar semantics (append new files, provide a way to determine which file is newest) is fine. What is important is that we never replace existing files (the contents of a file with a given name shall never change). We might delete older versions, though.

This is complicated: as previously mentioned, a workflow can spawn multiple harvesting jobs (and even run in multiple agents, even though CompOps makes sure that does not happen). The file name convention will likely have to change as well, to agree with one of the LFN lexicons: https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Lexicon.py#L344 File replacement is something we can ensure won't happen; files should have a GUID.
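For illustration, a lexicon-style check might look like the sketch below; the regular expression is a simplified stand-in of ours, not the actual rule at the Lexicon.py link above.

```python
import re

# Simplified stand-in for an LFN lexicon rule; the real patterns in
# WMCore/Lexicon.py are considerably stricter.
LFN_RE = re.compile(
    r"^/store/(?:data|mc|unmerged)(?:/[a-zA-Z0-9._\-]+)+\.root$"
)

def check_lfn(lfn):
    """Raise AssertionError on mismatch, in the fail-fast style of the
    WMCore lexicon validators."""
    assert LFN_RE.match(lfn), "Invalid LFN: %s" % lfn

check_lfn("/store/data/Run2018A/ZeroBias/DQMIO/v1/DQM_V0001_R000316216.root")
```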

Re the GUI uploads: For now, we'd like to have both the uploads and the new dataset at the same time, to allow for some validation and not require a hard switch-over. Once all the dataset machinery works and we have the new GUI running well (this might well take a year), the uploads can be stopped.

Okay!

Instead, we might want to receive an HTTP callback announcing new files, to avoid having to poll for new data constantly; but that can be on a best-effort basis and fall back to polling once per day.

The notification of new harvested data is better addressed in a future issue, IMO.

schneiml commented 4 years ago

I know that this has undesirable behaviour, but I'd rather discuss it in a different place/issue.

Ok, thanks for the clarification. Such behaviour will most likely lead to incorrect/misleading DQM output, but it does not seem to be an issue in practice so far.

The file name convention will likely have to change as well, to agree with one of the LFN lexicons:

That might be fine, or it might be a problem; I'm not sure.

In general, changing the versioning is perfectly OK. We need some way to find the latest version, but I assume there will be a timestamp somewhere, so even just having a GUID instead of the version number is fine.

However, the filename is (traditionally) the only source of metadata for these files, so the components that are there today need to be preserved somehow. Adding a prefix/suffix is fine, though.

Looking a bit more into the details: the file actually does record its run number in its content, and files containing more than one run are not entirely impossible at the format level (though nothing could really handle that today). So that leaves only the version (as discussed before) and the (input/DQMIO!) dataset name. I think we could get the dataset name from somewhere else as well, but things get more complicated in that case, also for "general" users: there are a lot of scripts by the subsystems that currently pull these files from the DQMGUI, and we'd really like them to fetch the files straight from the dataset instead, once we have that.
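Since the filename is the sole metadata carrier today, a small parser (ours; only the double-underscore layout is taken from the example filenames linked in the first comment) makes explicit what any renaming must preserve: version, run number, and dataset name.

```python
import re

def parse_dqm_filename(name):
    """Extract the metadata encoded in a DQMGUI harvested filename.
    Field names are our own labels for the components discussed above."""
    m = re.match(r"^DQM_V(\d{4})_R(\d{9})__(.+)\.root$", name)
    if not m:
        raise ValueError("not a DQM harvested file: %s" % name)
    primary, processed, tier = m.group(3).split("__")
    return {
        "version": int(m.group(1)),
        "run": int(m.group(2)),
        "dataset": "/%s/%s/%s" % (primary, processed, tier),
    }

info = parse_dqm_filename(
    "DQM_V0001_R000316216__ZeroBias__Run2018A-PromptReco-v1__DQMIO.root")
print(info)
# {'version': 1, 'run': 316216,
#  'dataset': '/ZeroBias/Run2018A-PromptReco-v1/DQMIO'}
```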

the notification of new harvested data is better to be addressed in a future issue IMO.

Agreed, let's first get the files saved somewhere...

fioriNTU commented 4 years ago

@amaltaro do you have any news about this? Thank you!

klannon commented 4 years ago

Hello @fioriNTU. I'm sorry about our slow responses here. The WM development team is under intense pressure to complete the necessary changes that will allow us to retire PhEDEx. These must be completed in six weeks to keep our schedule. Unfortunately, this means any projects that are not associated with that specific deadline have been put on pause. We'll come back to this in September. I hope this delay is not too painful for you.

fioriNTU commented 4 years ago

@klannon, thank you for the explanation; of course we understand and can wait some more. However, this feature is becoming more and more important for DQM-related developments, so please ping us as soon as there is news about it. Thank you!

jfernan2 commented 3 years ago

@klannon sorry to insist: any progress on this front? Thank you

andrius-k commented 3 years ago

Hello @klannon, any news on this issue?