dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0

srmv2 stageout implementation fails for file paths containing consecutive slashes #4131

Closed pkonst closed 9 years ago

pkonst commented 12 years ago

The SRM v2 stageout implementation determines the size of the written file by running srmls on the remote PFN and then grepping the output for the remote path (https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Storage/Backends/SRMV2Impl.py#L197). When the PFN and the path contain consecutive slashes, e.g. (*), this does not work: srmls returns the path with single slashes only, so the grep does not match. As a result, perfectly successful stageouts are declared failed.

(*) srm://lcgsedc01.jinr.ru:8443/srm/managerv2?SFN=/pnfs/jinr.ru/data/cms/store/temp/user/pkonst/logs///pkonst_crab_crab3test-PK-001_121003_101259/Analysis/0000/0/eb423dac-0d47-11e2-893f-003048f1c5ce-23-0-logArchive.tar.gz
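For illustration, the mismatch can be reproduced with a minimal sketch. It assumes srmls prints one `<size> <path>` line per file; that layout, the size value, and the helper name `find_size` are assumptions made for this example and are not taken from SRMV2Impl.py.

```python
def find_size(listing, remote_path):
    """Return the size from the first listing line containing remote_path, else None.

    Hypothetical helper: the substring test stands in for the grep step.
    """
    for line in listing.splitlines():
        if remote_path in line:
            return int(line.split()[0])
    return None


# Path part of the PFN above, with the triple slash after "logs".
requested = (
    "/pnfs/jinr.ru/data/cms/store/temp/user/pkonst/logs///"
    "pkonst_crab_crab3test-PK-001_121003_101259/Analysis/0000/0/"
    "eb423dac-0d47-11e2-893f-003048f1c5ce-23-0-logArchive.tar.gz"
)

# Assumed srmls-style output: the server reports the path with single slashes.
# The size (1024) is invented for the example.
reported = requested.replace("///", "/")
listing = "  1024 " + reported + "\n"

size_with_raw_path = find_size(listing, requested)      # no line matches
size_with_clean_path = find_size(listing, reported)     # the line matches
```

With the raw path the lookup finds nothing, although the file exists on the storage element; with the single-slash path the same listing yields the size.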

PerilousApricot commented 12 years ago

Which site is this for? It was my understanding everyone had moved to lcg-cp.

pkonst commented 12 years ago

The site is T2_RU_JINR. I have opened several "please, switch to lcg-cp" Savannah tickets myself, but apparently there are still sites using srmcp. I can do it for this site too, but in any case this looks like a bug in the stageout implementation.

The problem happens for me with CRAB3 analysis jobs. They have log archive LFNs like /store/temp/user/pkonst/logs///pkonst_crab_crab3test-PK-001_121003_101259/Analysis/0000/0/eb423dac-0d47-11e2-893f-003048f1c5ce-36-0-logArchive.tar.gz

cinquo commented 12 years ago

One of the slashes is probably being added by the analysis spec: https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WMSpec/StdSpecs/Analysis.py#L80 I am not sure about the other one...

hufnagel commented 12 years ago

One of the last lines after parsing the PFN in the CERN stageout plugin is:

        # remove multi-slashes from path
        while ( simpleCastorPath.find('//') > -1 ):
            simpleCastorPath = simpleCastorPath.replace('//','/')

Something like this in the right place should solve this...
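A minimal sketch of how that idea might be applied to a full SRM PFN, assuming the path follows a `?SFN=` marker as in the PFNs quoted in this issue. The names `collapse_slashes` and `normalize_pfn` are hypothetical and do not exist in WMCore. Only the part after the marker is rewritten, so the `srm://` scheme keeps its double slash.

```python
import re


def collapse_slashes(path):
    """Replace every run of two or more slashes with a single slash."""
    return re.sub(r"/{2,}", "/", path)


def normalize_pfn(pfn):
    """Collapse repeated slashes in the path of an SRM PFN (hypothetical helper).

    If the PFN has a '?SFN=' part, only the text after it is rewritten.
    Otherwise everything after '://' is rewritten, and a string without
    a scheme is treated as a plain path.
    """
    head, marker, sfn = pfn.partition("?SFN=")
    if marker:
        return head + marker + collapse_slashes(sfn)
    scheme, sep, rest = pfn.partition("://")
    if sep:
        return scheme + sep + collapse_slashes(rest)
    return collapse_slashes(pfn)


pfn = (
    "srm://lcgsedc01.jinr.ru:8443/srm/managerv2?SFN=/pnfs/jinr.ru/data/cms/"
    "store/temp/user/pkonst/logs///pkonst_crab_crab3test-PK-001_121003_101259/"
    "Analysis/0000/0/eb423dac-0d47-11e2-893f-003048f1c5ce-23-0-logArchive.tar.gz"
)
clean = normalize_pfn(pfn)
```

Applying the same normalization to the path that is later searched for in the srmls output would keep both sides of the comparison consistent.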

cinquo commented 12 years ago

I see. Actually, it would be good to understand where that comes from. In the meantime, I am removing an extra slash with this pull request.

cinquo commented 12 years ago

Running some tests: I see that the triple '/' is not caused by srmv2, but it only produces an error with that implementation, as Preslav said. Example: "srm://bsrm-1.t2.ucsd.edu:8443/srm/v2/server?SFN=/hadoop/cms/phedex/store/temp/user/mcinquil/logs///mcinquil_crab_newtest_02_tuesday_312pre3_121002_132226/Analysis/0000/0/1038a84c-0ec1-11e2-8379-0026b95c499b-17-0-logArchive.tar.gz" I am trying to verify where these slashes are being added. If it doesn't take too long, I would prefer to fix it at the source.

amaltaro commented 9 years ago

Seems it got fixed. @ticoann please close