dmwm / CRABServer


Publisher_rucio must check to have all files in a block #8491

Closed belforte closed 4 days ago

belforte commented 2 weeks ago

As described in #8376, the publication flag for individual files is set in PostJob up to 30 min after the file replica has been added to the Rucio-dataset/DBS-block by RucioTransfer. So when RucioTransfer adds the last file, declares the block closed and correspondingly marks all entries in filetransfersdb, some files in the block may still not have the publish flag set. If Publisher_Rucio runs "too soon" it will miss some files, and since published blocks are closed, it will never be able to add them.

The solution is that Publisher checks the block/dataset content in Rucio and waits until all needed files are ACQUIRED.
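A minimal sketch of that check, not the actual Publisher code. It assumes the caller already has two lists of file names: the content of the Rucio dataset for the block, and the files of that block which Publisher has ACQUIRED. The function name block_is_complete and the example LFNs are hypothetical; how the two lists are obtained (Rucio client, filetransfersdb query) is left out.

```python
def block_is_complete(rucio_files, acquired_files):
    """Compare the Rucio dataset content with the files acquired for publication.

    rucio_files:    iterable of file names listed in the Rucio dataset (the DBS block)
    acquired_files: iterable of file names that Publisher has ACQUIRED for this block
    Returns (ready, missing): ready is True only when every file in the
    Rucio dataset has been acquired; missing is the sorted list of the others.
    """
    missing = sorted(set(rucio_files) - set(acquired_files))
    return (not missing, missing)


# Hypothetical example: one file still waits for PostJob to set its publish flag.
in_rucio = ["/store/user/a/file_1.root", "/store/user/a/file_2.root"]
acquired = ["/store/user/a/file_1.root"]

ready, missing = block_is_complete(in_rucio, acquired)
if not ready:
    # Skip the block in this Publisher cycle and try again in the next one.
    print(f"block not ready, {len(missing)} file(s) still missing: {missing}")
```

With a check of this kind the block is only published once the missing list is empty, so a closed block never ends up without some of its files.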

Subtasks (tested in https://gitlab.cern.ch/crab3/CRABServer/-/pipelines/7641939 )

belforte commented 4 days ago

I realized that I need the Rucio scope name in addition to the DBS block name. This is trivial for users, but for groups it would require parsing the LFN, and maybe even that is not sufficient. In any case it is awkward and fragile. I propose to change the tm_dbs_blockname column to store the Rucio dataset name scope:dbs_blockname. @novicecpp what do you think?
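As a rough illustration of the proposed scope:dbs_blockname format, a small sketch of how the value could be built and split back. The helper names and the example scope and block name are hypothetical, and the sketch assumes the scope itself never contains a colon, so the first colon always separates scope from block name.

```python
def make_rucio_dataset_name(scope, dbs_blockname):
    """Build the value to store in tm_dbs_blockname as scope:dbs_blockname."""
    if not scope or ":" in scope:
        # Assumption: a valid scope is non-empty and contains no colon.
        raise ValueError(f"invalid scope: {scope!r}")
    return f"{scope}:{dbs_blockname}"


def split_rucio_dataset_name(value):
    """Split a stored scope:dbs_blockname value into (scope, dbs_blockname)."""
    # partition() splits on the first colon only, so the block name is kept whole.
    scope, sep, dbs_blockname = value.partition(":")
    if not sep or not scope or not dbs_blockname:
        raise ValueError(f"not in scope:dbs_blockname form: {value!r}")
    return scope, dbs_blockname


# Hypothetical values, for illustration only.
stored = make_rucio_dataset_name("user.jdoe", "/Prim/Proc-v1/USER#0123abcd")
scope, blockname = split_rucio_dataset_name(stored)
```

Storing both parts in one column means Publisher gets the scope directly, without having to derive it from the LFN.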

novicecpp commented 4 days ago

Agreed.