Closed kreczko closed 8 years ago
Hi Lukas,
one comment: AFAICT, you will need to update the parse function in perl_lib/PHEDEX/Namespace/gfal/stat.pm because it expects 'srmls' output, but the gfal-ls output is more similar to lcg-ls output. You can use the srmv2lcg namespace as an example of parsing lcg-ls:
https://github.com/dmwm/PHEDEX/blob/master/perl_lib/PHEDEX/Namespace/srmv2lcg/stat.pm
I'll wait for updated commits and test results to merge.
Thanks! Nicolo'
Hi Nicolo,
Thanks for the hint. I now have setool set up and am able to test the functionality.
There are two problems at the moment:
srmls -l gives you all the information, gfal-ls does not. To produce the same information as srmls -l, I also need to call gfal-sum <pfn> adler32.
I do not know how the latter would be possible given the way the Namespaces work. sub Command calls ns->cmd ns->opt <pfn> (which is wrong for gfal-sum) and can only run one command. I assume it is easy enough to override it in gfal/Common.pm and add two parse functions, one for each gfal command.
Does this sound about right?
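As a rough illustration of the "one parse function per gfal command" idea, here is a minimal sketch. It is not the code from the pull request: the helper names parse_ls_line and parse_sum_line are hypothetical, the example PFN is made up, and the field layout (access, links, uid, gid, month, day, time, size, pfn, optional locality) is only inferred from the sample output shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line of long-format gfal-ls output.
# Assumed layout, inferred from the sample output in this thread:
#   access links uid gid month day time size pfn [locality]
sub parse_ls_line {
    my ($line) = @_;
    my @f = split ' ', $line;
    return unless @f >= 9;              # not a long-format listing line
    my %r = (
        access => substr($f[0], 1),     # drop the leading file-type character
        uid    => $f[2],
        gid    => $f[3],
        size   => $f[7],
    );
    $r{locality} = $f[9] if defined $f[9];   # e.g. ONLINE, only when present
    return \%r;
}

# Hypothetical helper: parse one line of gfal-sum output, "<pfn> <checksum>".
sub parse_sum_line {
    my ($line, $type) = @_;
    my ($pfn, $value) = split ' ', $line;
    return unless defined $value;
    return { checksum_type => $type, checksum_value => $value };
}

# Merge the results of both commands into one stat record.
my $ls  = parse_ls_line('-rw-r--r-- 1 2 2 Oct 18 06:37 4182945455 srm://example.org/store/file.root ONLINE');
my $sum = parse_sum_line('srm://example.org/store/file.root 3949989d', 'adler32');
my %stat = (%$ls, %$sum);
print "$_ => $stat{$_}\n" for sort keys %stat;
```

Keeping the two parsers separate means each one only has to understand the output of a single command. The merged hash then carries the same keys (size, checksum_type, checksum_value, ...) that the srmls-based parser would have produced.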
The parsing is now implemented and passes my tests with setool.
I am going to deploy it now in our phedex instance and see if the BDV works.
EDIT: Still needs the mtime parsed, brb.
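For the mtime step, a minimal sketch of turning the "Mon DD HH:MM" part of a listing line into epoch seconds could look like the following. The helper name listing_mtime is hypothetical, and two things are assumptions: the listing carries no year, so the caller supplies one, and timegm (UTC) is used only to keep the example deterministic, where real code would pick timelocal or timegm to match the timezone of the tool's output. Note that Time::Local expects a zero-based month (Jan = 0). The mtime values in the sample output in this thread (1437657180 for "Jun 23 14:13", 1447828620 for "Oct 18 06:37") decode to 23 July and 18 November 2015, so it may be worth checking the month index there.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Map month abbreviations to the zero-based index Time::Local expects.
my %MONTH_INDEX;
@MONTH_INDEX{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Hypothetical helper: convert ("Jun", 23, "14:13", 2015) to epoch seconds.
# The year is passed in because the listing line does not contain one.
sub listing_mtime {
    my ($mon, $day, $hhmm, $year) = @_;
    return unless exists $MONTH_INDEX{$mon};   # unknown month name
    my ($hour, $min) = split /:/, $hhmm;
    return timegm(0, $min, $hour, $day, $MONTH_INDEX{$mon}, $year);
}

print listing_mtime('Jun', 23, '14:13', 2015), "\n";   # prints 1435068780
```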
The output of setool now looks like
./Utilities/setool --pfnlist mypfnlist gfal stat
gsiftp://lcgse01.phy.bris.ac.uk/dpm/phy.bris.ac.uk/home/cms/store/mc/RunIISpring15DR74/GluGluToRadionToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/MINIAODSIM/Asympt25ns_MCRUN2_74_V9-v1/00000/A8E137A4-9F18-E511-AD88-3417EBE535DA.root stat = [{'STDOUT' => ['-rw-rw-r-- 0 0 0 Jun 23 14:13 35440708 gsiftp://lcgse01.phy.bris.ac.uk/dpm/phy.bris.ac.uk/home/cms/store/mc/RunIISpring15DR74/GluGluToRadionToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/MINIAODSIM/Asympt25ns_MCRUN2_74_V9-v1/00000/A8E137A4-9F18-E511-AD88-3417EBE535DA.root '],'checksum_type' => 'adler32','uid' => '0','mtime' => 1437657180,'access' => 'rw-rw-r--','lifetime_left' => '-1','size' => '35440708','STDOUT_CKSUM' => ['gsiftp://lcgse01.phy.bris.ac.uk/dpm/phy.bris.ac.uk/home/cms/store/mc/RunIISpring15DR74/GluGluToRadionToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/MINIAODSIM/Asympt25ns_MCRUN2_74_V9-v1/00000/A8E137A4-9F18-E511-AD88-3417EBE535DA.root 06c5aff5'],'checksum_value' => '06c5aff5','space_token' => '','type' => 'FILE','gid' => '0'}]
and works also with the srm protocol:
./Utilities/setool --pfnlist testfiles gfal stat
srm://cmssrm-kit.gridka.de:8443/srm/managerv2?SFN=/pnfs/gridka.de/cms/disk-only/store/data/Run2015C_25ns/TOTEM_minBias1/RECO/05Oct2015-v1/40000/78552A3A-5875-E511-AE76-001E67E6F8D2.root stat = [{'STDOUT' => ['-rw-r--r-- 1 2 2 Oct 18 06:37 4182945455 srm://cmssrm-kit.gridka.de:8443/srm/managerv2?SFN=/pnfs/gridka.de/cms/disk-only/store/data/Run2015C_25ns/TOTEM_minBias1/RECO/05Oct2015-v1/40000/78552A3A-5875-E511-AE76-001E67E6F8D2.root ONLINE'],'checksum_type' => 'adler32','uid' => '2','mtime' => 1447828620,'access' => 'rw-r--r--','lifetime_left' => '-1','locality' => 'ONLINE','size' => '4182945455','STDOUT_CKSUM' => ['srm://cmssrm-kit.gridka.de:8443/srm/managerv2?SFN=/pnfs/gridka.de/cms/disk-only/store/data/Run2015C_25ns/TOTEM_minBias1/RECO/05Oct2015-v1/40000/78552A3A-5875-E511-AE76-001E67E6F8D2.root 3949989d'],'checksum_value' => '3949989d','type' => 'FILE','space_token' => '','gid' => '2'}]
Obviously a different set of information is available using srm with srmls compared to gfal, but I hope that the most important ones (size, checksum, checksum type) are OK.
While the output looks fine to me, the new namespace still fails for BDV. In the Phedex config I use
### AGENT LABEL=download-verify PROGRAM=Toolkit/Verify/BlockDownloadVerify ENVIRON=glite
-db ${PHEDEX_DBPARAM}
-namespace gfal
-nodes ${PHEDEX_NODE}
-protocol 'srmv2'
but the logs show
2015-10-22 16:39:54: BlockDownloadVerify[49103]: Request=102892726, state=Active
Perform the following checks:
SIZE : yes
gfal-ls: error: Protocol not supported
which probably happens when the gsiftp protocol is used with the xattr option. However, line https://github.com/dmwm/PHEDEX/pull/1010/files#diff-e300809a18186d02df177aa0a24463fdR122 is meant to prevent exactly that.
Any ideas?
I have removed the '--xattr user.status' option for now and the BDV is now working:
https://cmsweb.cern.ch/phedex/prod/Data::Verify?test=Any&node=.*Bristol&block=.*&status=Any&.submit=Update#
Hi Lukas,
sorry for the late reply... two comments about your questions:
1) If you need to run a different command to get the checksum, you should actually split out the checksum command to a separate module instead of overwriting Command.pm.
Remove checksum_type/checksum_value from the fields of the stat module, and create a new module providing those fields. The Namespace framework will take care of automatically loading the correct module for the fields you request in the test.
As an example, you can take the checksum module for the posix filesystem:
https://github.com/dmwm/PHEDEX/blob/master/perl_lib/PHEDEX/Namespace/posix/checksum.pm
2) Concerning the --xattr user.status option issue, AFAIK you simply need to check $pfn =~ 'gsiftp://' instead of $file =~ 'gsiftp://' in
https://github.com/dmwm/PHEDEX/pull/1010/files#diff-e300809a18186d02df177aa0a24463fdR122
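To illustrate the suggested check, here is a minimal sketch of choosing the options per PFN. The helper name ls_options and the example PFNs are hypothetical, and the base option list is whatever the caller passes in; only the '--xattr user.status' option and the gsiftp:// test come from this thread. The pattern is anchored with ^ as a small tightening of the unanchored match quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the option list to use for one PFN.
# '--xattr user.status' is skipped for gsiftp PFNs, where it was
# reported in this thread to fail with "Protocol not supported".
sub ls_options {
    my ($pfn, @base_opts) = @_;
    my @opts = @base_opts;
    push @opts, '--xattr', 'user.status' unless $pfn =~ m{^gsiftp://};
    return @opts;
}

print join(' ', ls_options('gsiftp://example.org/store/file.root', '-l')), "\n";
print join(' ', ls_options('srm://example.org/store/file.root', '-l')), "\n";
```

The important point is that the test runs against the PFN that is actually handed to the command, so the decision follows the protocol of each individual file.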
Could you please try to apply these changes and update the pull request?
Thanks! N.
Hi Lukas,
I'll merge your pull request and take care of fixing the additional issues reported in https://github.com/dmwm/PHEDEX/pull/1010#issuecomment-152584290
Cheers N.
This pull request implements the gfal namespace for phedex to solve the issue discussed in https://hypernews.cern.ch/HyperNews/CMS/get/phedex/2618.html
Testing of this implementation is still outstanding and planned for tomorrow. The implementation is a copy of the SRM namespace with unnecessary lines removed and commands altered. If you see any obvious problems with the implementation, please comment.