Closed elisal closed 9 years ago
Hi Elisa,
NoLFC should not be an argument but a switch.
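To illustrate the difference between a positional argument and a switch, here is a minimal sketch. It uses Python's standard `argparse` module as a stand-in, not the DIRAC script machinery, and the helper name `build_parser` is made up for this example.

```python
import argparse


def build_parser():
    """Hypothetical parser: LFN and SE are positional, NoLFC is a boolean switch."""
    parser = argparse.ArgumentParser(prog="dirac-dms-remove-lfn-replica")
    parser.add_argument("lfn", help="logical file name")
    parser.add_argument("se", help="storage element name")
    # A switch takes no value: it is False unless it appears on the command line.
    parser.add_argument("--NoLFC", action="store_true",
                        help="remove the physical replica without touching the FC")
    return parser


with_switch = build_parser().parse_args(
    ["/lhcb/user/l/lanciott/apiEx.py", "CERN-USER", "--NoLFC"])
without_switch = build_parser().parse_args(
    ["/lhcb/user/l/lanciott/apiEx.py", "CERN-USER"])
```

With a switch, the number of positional arguments stays the same whether or not the option is used, so the script does not need to guess what a third argument means.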
Ricardo
Hi, you are right. Now it's a switch. https://github.com/elisal/DIRAC/blob/removeUnregisteredRepl/Interfaces/scripts/dirac-dms-remove-lfn-replica.py cheers
Hi Elisa,
I think that in both cases (with or without the switch) you should first try to remove using the RM.removeReplica method (or the dirac API equivalent). If the NoLFC option is set, and the removal has failed with 'No such file or directory', then you have to use the RM.getPfnForLfn and then RM.removeStorageFile.
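The order of operations proposed here can be sketched as follows. `FakeReplicaManager` is an in-memory stand-in written for this example; it only borrows the method names mentioned above, and its argument order and return values are simplified assumptions, not the real ReplicaManager behaviour. The helper `remove_replica` is likewise hypothetical.

```python
NO_SUCH_FILE = "No such file or directory"


class FakeReplicaManager:
    """In-memory stand-in (assumption): a catalog dict plus a set of stored files."""

    def __init__(self, catalog, storage):
        self.catalog = catalog   # {lfn: set of SE names}
        self.storage = storage   # set of (se_name, pfn) tuples

    def getPfnForLfn(self, lfn, se_name):
        # Made-up SURL layout, for illustration only.
        return {"OK": True, "Value": "srm://%s.example.org%s" % (se_name.lower(), lfn)}

    def removeStorageFile(self, pfn, se_name):
        self.storage.discard((se_name, pfn))
        return {"OK": True, "Value": pfn}

    def removeReplica(self, se_name, lfn):
        # Fails if the replica is not registered in the catalog.
        if se_name not in self.catalog.get(lfn, set()):
            return {"OK": False, "Message": NO_SUCH_FILE}
        pfn = self.getPfnForLfn(lfn, se_name)["Value"]
        self.removeStorageFile(pfn, se_name)
        self.catalog[lfn].discard(se_name)
        return {"OK": True, "Value": lfn}


def remove_replica(rm, lfn, se_name, no_lfc=False):
    """Try the normal removal first; go to the storage only as a fallback."""
    res = rm.removeReplica(se_name, lfn)
    if res["OK"]:
        return res
    # Fall back only when the switch is set and the catalog does not know the file.
    if not no_lfc or NO_SUCH_FILE not in res["Message"]:
        return res
    res = rm.getPfnForLfn(lfn, se_name)
    if not res["OK"]:
        return res
    return rm.removeStorageFile(res["Value"], se_name)
```

Because the catalog-aware removal always runs first, a registered replica can never be deleted from the storage while its catalog entry is left behind.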
The logic you have implemented allows a replica to be removed from the Storage while leaving the replica info in the LFC. This is clearly not what we want.
Have we agreed to keep the Dirac or the ReplicaManager based dms scripts?
Sorry, was too fast:
OK, if you prefer I will change the script to call ReplicaManager.removeReplica() first, and then consider the switch. However, with the current logic no inconsistency can be generated, with or without the NoLFC switch. BTW, another possibility is to merge the 2 DMS 'removal' scripts (dirac-dms-remove-lfn-replica and dirac-dms-remove-replicas) as said in a mail thread some time ago. The idea was to enhance the Dirac API to be able to remove replicas from multiple SEs (and now also to remove a replica from storage, to cover the use case of removing replicas that are not registered), and then get rid of the script directly based on ReplicaManager.
The logic in your code, with the "NoLFC" switch, will remove the replica from the Storage without checking/removing anything from the LFC. So if the replica is properly registered in the LFC, the file will be removed from the storage, but not from the LFC. The inconsistency is only avoided if you first run the script without the switch and then again with the switch (but this cannot be our working assumption).
We need to be consistent. If for the moment you do not want to touch the Dirac API, the new script has to start from the one using the ReplicaManager and add the extra code needed. If you want to start from the script based on the Dirac API, then we should add a new method to the API that does what we have described: try to remove first as if the replica is properly registered, and then go directly to the Storage and remove again.
Hi Ricardo, please see the example below (the same I posted already before)
[hpdesk] > dirac-dms-lfn-replicas /lhcb/user/l/lanciott/apiEx.py
{'Failed': {}, 'Successful': {'/lhcb/user/l/lanciott/apiEx.py': {'CERN-USER': 'srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/l/lanciott/apiEx.py', 'RAL-USER': 'srm://srm-lhcb.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/lhcb/user/l/lanciott/apiEx.py'}}}
[hpdesk] > dirac-dms-remove-lfn-replica /lhcb/user/l/lanciott/apiEx.py CERN-USER --NoLFC
WARNING: removing physical replica from storage, without removing entry in the FC
WARNING: file is registered in FC! it will NOT be removed from storage!
{'OK': True, 'Value': {'Successful': {'/lhcb/user/l/lanciott/apiEx.py': True}, 'Failed': {}}}
[hpdesk] > dirac-dms-lfn-replicas /lhcb/user/l/lanciott/apiEx.py
{'Failed': {}, 'Successful': {'/lhcb/user/l/lanciott/apiEx.py': {'CERN-USER': 'srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/l/lanciott/apiEx.py', 'RAL-USER': 'srm://srm-lhcb.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/lhcb/user/l/lanciott/apiEx.py'}}}
The script checks whether the replica is registered and, if it is, it does not remove the file. Why do you say that it removes the replica without checking/removing anything from the LFC?
Sorry, I had missed:
    res = rm.getReplicaIsFile( lfn, seName )
    if res['OK']:
      print 'WARNING: file is registered in FC! it will NOT be removed from storage! ', res
      continue
You are right.
Still, this means that if you have a bunch of files that are "problematic" you have to run the command twice to solve the situation. I think that the option "DoNotTrustFC" should allow the replica(s) to be removed with a single command, no matter whether they are registered in the FC (not LFC) or not.
Maybe we need another flag to remove a replica from Storage only if it is not registered in the FC.
Chris, I assign this to you momentarily. Please close it if you think it's done.
All this (and more!) is already implemented in the DMScript of LHCbDIRAC. There have been plenty of discussions about porting at least part of it into DIRAC, if I understood correctly. Anyway, most of what is said here is (or will be very soon) obsolete, so I think it can be closed, but I don't have the karma for it.
Closing it, moving to DIRAC can be a good idea.
Hi, I modified the dirac-dms-remove-lfn-replica script, adding an option: FCCheck. By default it is YES, so it checks the replica's existence in the LFC and behaves exactly like the current version. If the option is set to NOLFC, then it does the following:
- it checks in any case whether the replica is registered; if it is, it doesn't remove anything
- if the replica is NOT registered, it calls StorageElement.getPfnForLfn() to get the SURL, and then calls ReplicaManager.removeStorageFile( surl, seName )
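The behaviour of the FCCheck option described above can be sketched with plain Python data structures. This is only an illustration under stated assumptions: the catalog is a dict, the storage is a set of SURLs, and `surl_for` and `remove_with_fc_check` are hypothetical helpers, not the code of the actual script.

```python
def surl_for(lfn, se_name):
    """Hypothetical SURL builder standing in for StorageElement.getPfnForLfn()."""
    return "srm://%s.example.org%s" % (se_name.lower(), lfn)


def remove_with_fc_check(catalog, storage, lfn, se_name, fc_check="YES"):
    """Return 'removed', 'refused' or 'failed' for one replica.

    catalog: {lfn: set of SE names} (stand-in for the file catalog)
    storage: set of SURLs present on storage (stand-in for the SEs)
    """
    registered = se_name in catalog.get(lfn, set())
    surl = surl_for(lfn, se_name)
    if fc_check == "YES":
        # Default: behave like the current version, catalog and storage together.
        if not registered:
            return "failed"
        catalog[lfn].discard(se_name)
        storage.discard(surl)
        return "removed"
    # NOLFC: never touch a replica that the catalog knows about.
    if registered:
        return "refused"
    if surl in storage:
        storage.remove(surl)
        return "removed"
    return "failed"
```

In this sketch a registered replica is always left alone when NOLFC is given, so the option cannot leave a catalog entry pointing at a deleted file.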
Examples:
[lxplus423] > srmls srm://srm-lhcb.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/lhcb/test/roberto/temp/SARA_5.13778
5908 /castor/ads.rl.ac.uk/prod/lhcb/test/roberto/temp/SARA_5.13778
this file exists on storage and it is not registered in LFC:
then the script with the NOLFC option will remove it:
[hpdesk] > dirac-dms-remove-lfn-replica /lhcb/test/roberto/temp/SARA_5.13778 RAL-DST NOLFC
WARNING: removing physical replica from storage, without removing entry in the FC
ReplicaManager.executeReplicaStorageElementOperation: Failed to get replicas for file. /lhcb/test/roberto/temp/SARA_5.13778 No such file or directory
ReplicaManager._executeStorageElementFunction: No pfns supplied.
ReplicaManager.executeReplicaStorageElementOperation: Failed to execute isFile StorageElement operation.
ReplicaManager._executeStorageElementFunction: No pfns supplied.
Summary:
Successfully removed: ['srm://srm-lhcb.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/lhcb/test/roberto/temp/SARA_5.13778']
Failed to remove: []
and in fact the file has been removed:
On the other hand, if I execute the script with the NOLFC option for a replica that IS REGISTERED in LFC, the script will refuse to remove it. E.g.
I try to remove it with the NOLFC option but the script says no, it can't be removed only from storage:
[hpdesk] /home/elisal/dev > dirac-dms-remove-lfn-replica /lhcb/user/l/lanciott/apiEx.py CERN-USER NOLFC
WARNING: removing physical replica from storage, without removing entry in the FC
WARNING: file is registered in FC! it will NOT be removed from storage!
{'OK': True, 'Value': {'Successful': {'/lhcb/user/l/lanciott/apiEx.py': True}, 'Failed': {}}}
Summary:
Successfully removed: []
Failed to remove: []
and in fact the replica at CERN-USER is still there:
I think the functionality is fine. I will do some more tests and then, if there are no objections, I will issue a pull request. The code is here: https://github.com/elisal/DIRAC/blob/removeUnregisteredRepl/Interfaces/scripts/dirac-dms-remove-lfn-replica.py cheers