Just a small note as I've been working with LIGO to utilize OSG and EGI-based computing centers for multiple years now. While "proprietary HPC clusters" characterizes at least one LIGO resource, there's actually a wide range of resources available.
Regardless, that's neither here nor there. The rest of the post looks good!
Hi Brian,
Thanks for the heads up. Indeed I was just intending to highlight the differences between our two computing infrastructures, not to be "detailed"! :) Cheers!
Gabriele
Hi guys, fascinating discussion, this is something that we have mulled over in @EGI-Foundation for a bit as well. It would be out of place to make sweeping statements about DIRAC without the developers involved, so maybe this issue could be pointed out to them if that hasn't already been done.
My 2c is that there are two patterns right now in developing these platforms:
We have often looked at DIRAC as an HTC solution, but it's way more than that and just using it as an HTC solution is actually quite hard. I hazard to say that it works best when it's the primary interface for users and applications.
Rucio on the other hand is (forgive me for projecting my own perception here) a fantastic data management system. It could (and maybe already does?) tack on compute management as well. As a product, we (say, EGI) would like it to interoperate with other services like cloud compute, HTC, HPC, etc., via stable APIs and do its data management thing.
It would be nice to know if Rucio could be used as a drop-in replacement data catalogue for DIRAC, and more interesting to know if DIRAC could be used as a drop-in compute orchestration service for Rucio. My personal feeling is that something that does compute orchestration only would be a better fit (maybe, HTCondor, I don't have a great answer here, sorry).
Thanks! (usual disclaimer of "these opinions are mine and mine alone", "this does not represent the position of EGI, EGI Foundation etc" apply here :wink: )
Hi Bruce,
I am pointing some DIRAC people to this issue! Cheers,
Gabriele
Hi, I am the DIRAC technical coordinator, and right now its main developer. I've been pointed here, so I will try to give some advice.
As mentioned above, DIRAC gives you the possibility to work with different, and even multiple, catalogs. Just to mention some real-life use cases, which are the ones working best:
The DFC, the LFC, AMGA and the LHCb Bookkeeping are all "Catalogs". In DIRAC terminology they are in fact all Catalog plug-ins: a DIRAC Catalog is anything that implements the common catalog interface (e.g. add file, remove file, etc.). All catalogs inherit from https://github.com/DIRACGrid/DIRAC/blob/integration/Resources/Catalog/FileCatalogClientBase.py
You can have more than 1 catalog at the same time, as obvious from the examples above. In this case, the operations will be executed on all of them. So, for example you can register files on BOTH the LFC and DFC at the same time.
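Just to make the multi-catalog behaviour concrete, here is a rough client-side sketch; the `FileCatalog` helper, its `catalogs` keyword, the catalog names and the metadata fields are written from memory, so treat them as assumptions to check against the DIRAC code:

```python
# Rough sketch: register one LFN and have the operation fan out to all
# configured catalogs (e.g. DFC + LFC). Names and fields are illustrative.
from DIRAC.Resources.Catalog.FileCatalog import FileCatalog

fc = FileCatalog(catalogs=["FileCatalog", "LcgFileCatalog"])  # hypothetical catalog names
result = fc.addFile({
    "/virgo/user/g/gabriele/myfile.root": {       # LFN
        "PFN": "srm://some.se/path/myfile.root",  # physical replica
        "SE": "SOME-SE",
        "Size": 1024,
        "GUID": "aaaa-bbbb-cccc",
        "Checksum": "12345678",
    }
})
# DIRAC-style return value: S_OK with per-LFN "Successful"/"Failed" dictionaries,
# reporting the outcome of the operation in each configured catalog.
print(result)
```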
Basically, each catalog plug-in implements a certain operation (e.g. the `addFile` operation) following its own "interpretation" of what, e.g., adding a file means for that catalog.
So, what may be interesting for you is implementing a `RucioCatalogClient.py`. The rest is purely configuration.
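To give an idea of what such a plug-in could look like, here is a minimal sketch of a hypothetical `RucioCatalogClient.py`; the base-class contract, the LFN-to-(scope, name) mapping and the exact Rucio client calls are my assumptions for illustration, not the implementation that was eventually written:

```python
# Minimal sketch of a Rucio catalog plug-in for DIRAC (illustration only).
# The base-class contract, the LFN -> (scope, name) mapping and the exact
# Rucio client calls are assumptions and would need checking.
from DIRAC import S_OK
from DIRAC.Resources.Catalog.FileCatalogClientBase import FileCatalogClientBase
from rucio.client import Client
from rucio.common.exception import DataIdentifierNotFound


class RucioCatalogClient(FileCatalogClientBase):

    WRITE_METHODS = ["addFile"]  # read methods are inherited from the base class

    def __init__(self, **options):
        super().__init__(**options)
        self.client = Client()  # picks up the local rucio.cfg and credentials

    @staticmethod
    def _lfnToDid(lfn):
        # Hypothetical mapping: first path element as Rucio scope, full LFN as name.
        return lfn.strip("/").split("/")[0], lfn

    def exists(self, lfns):
        successful, failed = {}, {}
        for lfn in lfns:
            scope, name = self._lfnToDid(lfn)
            try:
                self.client.get_did(scope, name)
                successful[lfn] = True
            except DataIdentifierNotFound:
                successful[lfn] = False
            except Exception as e:
                failed[lfn] = str(e)
        return S_OK({"Successful": successful, "Failed": failed})

    def addFile(self, lfns):
        successful, failed = {}, {}
        for lfn, info in lfns.items():
            scope, name = self._lfnToDid(lfn)
            try:
                self.client.add_replicas(
                    rse=info["SE"],
                    files=[{"scope": scope, "name": name,
                            "bytes": info["Size"], "adler32": info["Checksum"]}],
                )
                successful[lfn] = True
            except Exception as e:
                failed[lfn] = str(e)
        return S_OK({"Successful": successful, "Failed": failed})
```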
Hi @fstagni,
thank you for joining the conversation. Indeed that was the solution we thought about at first, but I have some questions to ask: does a Rucio plug-in have to implement the full `FileCatalogClientBase.py` interface, or would a custom derived interface be enough?
Thank you,
Gabriele
Looking at `FileCatalogClientBase.py` I see `READ_METHODS = [ 'hasAccess', 'exists', 'getPathPermissions' ]`. It seems to me (and it makes total sense) that DIRAC requires being able to read data, but persistently writing output data is not strictly required for the computing functionalities. In addition I would like to ask you if there is any example of a custom implementation of `FileCatalogClientBase.py` (e.g. for the LFC) I could read, to understand the integration process better.
Thank you
Gabriele
For the `FileCatalogClientRucio.py` implementation it should be enough to use direct calls instead of a custom DIRAC plugin. The `RucioFileCatalogClient.py` file should be the only one in the whole of DIRAC where you do `import rucio`.

A small update on this ticket: a `RucioFileCatalogClient` is now available in DIRAC. It was merged into v7r0:
https://github.com/DIRACGrid/DIRAC/pull/5067
The patch also contains a `RucioSynchronizerAgent` and a `RucioRSSAgent` that are used to synchronize the DIRAC Configuration Service and Rucio. What is currently missing is writing and setting up the tests to validate it. Setting up the tests will require creating a Rucio instance that can be used in GitHub Actions. Any help is welcome. If anyone is interested in working on this, please get in touch with me.
Caveat: the implementation of the RFC (RucioFileCatalog) is based on the Belle II one and, although we tried to be collaboration-agnostic, there might be a few things to change for other communities.
I am closing this ticket now; it's largely an overview ticket anyway. Specific changes which still need to be addressed are in the Rucio issue tracker under the DIRAC label or on the DIRAC tracker.
The LIGO and Virgo communities are stepping into Rucio to manage the storage elements of their infrastructures. While LIGO has proprietary HPC clusters at its disposal, Virgo doesn't and instead relies on a set of academic computing centers. The characteristics of the latter, as well as some previous choices, strongly point to a wide adoption of DIRAC in Virgo, while interoperability with LIGO requires Rucio as storage manager/orchestrator.
After some discussion we (maybe) found a nice solution. Instead of developing a DIRAC plugin able to interface it to Rucio in a POSIX-like manner, we think the best option is to create a "DIRAC mode" for the Rucio catalog. This argument follows a comment made by many people: DIRAC was born to create a uniform interface to different GRID implementations, but ended up managing both Computing Elements (CE) and Storage Elements (SE). This choice was made to make DIRAC aware of the geographical position of data in order to minimize data transfers. Since DIRAC is used by a rather small community, not many organizations might be interested in stepping into the development of an integration. Rucio, instead, is much more appealing, and finding a way to keep enough topological information in Rucio's external catalog to make DIRAC happy and efficient might be the way to go.
In addition we discovered that DIRAC allows passing it a custom catalog, and some Virgo people have already performed some tests in that sense, creating an LFC catalog dump and running a DIRAC instance on that data. In fact such a solution might scale up to decoupling the DIRAC storage function from the computing function, giving lots of benefits even to the DIRAC product. This could bring in some of their developers to assist in the process. It is worth mentioning that DIRAC jobs can be any kind of executable, from an `sh` script to an executable available cluster(s)-wide (e.g. firefox...). Since reading Rucio-managed files should be supported out of the box by DIRAC, a plugin is needed only to register the output files of the jobs in Rucio. However, since DIRAC can run basic scripts, at first the registration of output files in Rucio might be handled by by-hand calls to the Rucio API from within the DIRAC job. If we think about a compiled C++ executable which produces a `myfile.root` file and runs on a shell as `myexe myfile.root`, wrapping the same in something like `myexe myfile.root && <rucio_API_call> myfile.root` should do the basic trick.
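As a sketch of that "wrap the payload and register by hand" idea (the scope, RSE and file names below are made up, and the `UploadClient` usage should be double-checked against the Rucio docs):

```python
# Sketch of a DIRAC job payload: run the executable, then upload/register
# its output in Rucio by hand. All names are placeholders.
import subprocess
from rucio.client.uploadclient import UploadClient

# 1) run the actual payload, e.g. a compiled C++ executable producing myfile.root
subprocess.run(["./myexe", "myfile.root"], check=True)

# 2) upload the output to an RSE and register it in the Rucio catalogue
UploadClient().upload([{
    "path": "myfile.root",          # local file produced by the job
    "rse": "VIRGO-SE",              # hypothetical destination RSE
    "did_scope": "user.gabriele",   # hypothetical Rucio scope
}])
```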