Hello,
We in Belle II operations have noticed that RMS often flags a file in its ReqDB.File table as 'Done' when in fact it is either not replicated or not registered in the FC/LFC.
Here is a concrete example.
A file was supposed to be transferred to SIGNET-TMP-SE. ReqDB.File says 'Done':
LFN='/belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root'
query='SELECT * FROM File WHERE LFN="%s"'%LFN
runQuery(query)
The REA log shows an error for this file:
$ grep /belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root /opt/dirac/runit/RequestManagement/RequestExecutingAgent/log/current
2016-12-10 17:31:20 UTC RequestManagement/RequestExecutingAgent/pid_32596/DDM_fromNONE_toSIGNET-TMP-SE_20161210_173005/0/ReplicateAndRegister WARN: unable to schedule /belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root for FTS: _getSurlForLFN: Failed to create SRM2 storage for SIGNET-TMP-SE: StorageFactory._getStorageOptions: Failed to get storage status
2016-12-10 17:31:23 UTC RequestManagement/RequestExecutingAgent/SRM2Storage INFO: __putFile: Executing transfer of srm://se.hep.pnnl.gov:8443/srm/v2/server?SFN=/se/belle/TMP/belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root to srm://dcache.ijs.si:8443/srm/managerv2?SFN=/pnfs/ijs.si/belle/TMP/belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root using 4 streams
Also, the file was never transferred to the SE:
$ srmls srm://dcache.ijs.si:8443/srm/managerv2?SFN=/pnfs/ijs.si/belle/TMP/belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root
Sat Dec 10 09:45:07 PST 2016: Return status:
Status code: SRM_FAILURE
Explanation: All ls requests failed in some way or another
SRM_INVALID_PATH File/directory 0 /pnfs/ijs.si/belle/TMP/belle/MC/fab/release-00-07-02/DBxxxxxxxx/MC7/prod00000823/s00/e0000/4S/r00000/ddbar/sub15/mdst_016883_prod00000823_task00016883.root does not exist.
--
There has been a hint that, to preserve the retry mechanism, the file status is not changed just yet in the ReplicateAndRegister plugin code. A related post on the forum is
https://groups.google.com/forum/#!topic/diracgrid-forum/JclTwweKp2s
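In case it helps to reproduce the mismatch on a larger sample, the cross-check we have in mind can be sketched roughly as follows. This is only an illustration and not DIRAC code: the function name find_false_done, the (LFN, status, target SE) row layout and the replica map are all assumptions, and the example LFNs are made up. The rows would have to be pulled from ReqDB and the replica locations from the catalog (or from an SE listing) by whatever means is available.

```python
# Hypothetical cross-check; none of these names come from DIRAC itself.
def find_false_done(file_rows, replicas):
    """Return (lfn, target_se) pairs marked 'Done' without a replica at the target SE.

    file_rows: iterable of (lfn, status, target_se) tuples (assumed layout).
    replicas:  dict mapping lfn -> set of SE names holding a known replica.
    """
    suspicious = []
    for lfn, status, target_se in file_rows:
        if status != 'Done':
            continue  # only 'Done' entries can be falsely reported as done
        if target_se not in replicas.get(lfn, set()):
            suspicious.append((lfn, target_se))
    return suspicious


# Made-up sample data for illustration only.
rows = [
    ('/belle/example/a.root', 'Done', 'SIGNET-TMP-SE'),
    ('/belle/example/b.root', 'Done', 'SIGNET-TMP-SE'),
    ('/belle/example/c.root', 'Failed', 'SIGNET-TMP-SE'),
]
catalog = {'/belle/example/a.root': {'SIGNET-TMP-SE'}}  # b.root has no replica listed
print(find_false_done(rows, catalog))
```

With the sample data above, only b.root is reported, since it is 'Done' but has no replica listed at SIGNET-TMP-SE.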
This issue awaits your attention.
Thanks, Vikas