opensciencegrid / StashCache

https://opensciencegrid.org/docs/data/stashcache/overview/
Apache License 2.0
1 stars 16 forks source link

StashCP of zero-length files #77

Open djw8605 opened 6 years ago

djw8605 commented 6 years ago

Stashcp stalls on zero-length files. Investigate!

rynge commented 4 years ago

I ran into a 0-byte file problem as well. Turns out it works with --methods=xrootd but not --methods==http:

$ stashcp --debug --methods=http,xrootd '/user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv' /tmp/
2019-11-23T11:42:23+0000 root         INFO     Trying HTTP...
2019-11-23T11:42:23+0000 root         DEBUG    Downloading with HTTP
2019-11-23T11:42:23+0000 root         DEBUG    Loaded caches list from /usr/share/stashcache/caches.json
2019-11-23T11:42:23+0000 root         DEBUG    Trying geoip site of: wlcg-wpad.fnal.gov [131.225.205.134]
2019-11-23T11:42:23+0000 root         DEBUG    Querying for closest cache: http://131.225.205.134/api/v1.0/geo/stashcp/its-condor-xrootd1.syr.edu,sc-cache.chtc.wisc.edu,osg-new-york-stashcache.nrp.internet2.edu,osg-kansas-city-stashcache.nrp.internet2.edu,osg-chicago-stashcache.nrp.internet2.edu,fiona.uvalight.net,stashcache.t2.ucsd.edu,osg-gftp.pace.gatech.edu
2019-11-23T11:42:23+0000 root         DEBUG    Got OK code 200 from wlcg-wpad.fnal.gov
2019-11-23T11:42:23+0000 root         DEBUG    Got order ['5', '2', '4', '8', '1', '7', '3', '6']
2019-11-23T11:42:23+0000 root         DEBUG    Returning closest cache: root://osg-chicago-stashcache.nrp.internet2.edu
2019-11-23T11:42:23+0000 root         DEBUG    Ordered list of nearest caches: [u'root://osg-chicago-stashcache.nrp.internet2.edu', u'root://sc-cache.chtc.wisc.edu', u'root://osg-kansas-city-stashcache.nrp.internet2.edu', u'root://osg-gftp.pace.gatech.edu', u'root://its-condor-xrootd1.syr.edu', u'root://stashcache.t2.ucsd.edu', u'root://osg-new-york-stashcache.nrp.internet2.edu', u'root://fiona.uvalight.net']
2019-11-23T11:42:23+0000 root         DEBUG    About to run curl command: curl -v -L --connect-timeout 30 --speed-limit 1024 -O --fail http://osg-chicago-stashcache.nrp.internet2.edu:8000/user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* About to connect() to osg-chicago-stashcache.nrp.internet2.edu port 8000 (#0)
*   Trying 163.253.70.2...
* Connected to osg-chicago-stashcache.nrp.internet2.edu (163.253.70.2) port 8000 (#0)
> GET /user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv HTTP/1.1
> User-Agent: curl/7.29.0
> Host: osg-chicago-stashcache.nrp.internet2.edu:8000
> Accept: */*
> 
< HTTP/1.1 302 Redirect
< Connection: Keep-Alive
< Content-Length: 0
< Location: http://osg.chic.nrp.internet2.edu:9001/user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv
< 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host osg-chicago-stashcache.nrp.internet2.edu left intact
* Issue another request to this URL: 'http://osg.chic.nrp.internet2.edu:9001/user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv'
* About to connect() to osg.chic.nrp.internet2.edu port 9001 (#1)
*   Trying 163.253.70.2...
* Connected to osg.chic.nrp.internet2.edu (163.253.70.2) port 9001 (#1)
> GET /user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv HTTP/1.1
> User-Agent: curl/7.29.0
> Host: osg.chic.nrp.internet2.edu:9001
> Accept: */*
> 
  0     0    0     0    0     0      0      0 --:--:--  0:00:30 --:--:--     0* Operation too slow. Less than 1024 bytes/sec transferred the last 30 seconds
  0     0    0     0    0     0      0      0 --:--:--  0:00:30 --:--:--     0
* Closing connection 1
curl: (28) Operation too slow. Less than 1024 bytes/sec transferred the last 30 seconds
2019-11-23T11:42:53+0000 root         DEBUG    About to run curl command: curl -v -L --connect-timeout 30 --speed-limit 1024 -O --fail http://sc-cache.chtc.wisc.edu:8000/user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* About to connect() to sc-cache.chtc.wisc.edu port 8000 (#0)
*   Trying 128.104.100.50...
* Connected to sc-cache.chtc.wisc.edu (128.104.100.50) port 8000 (#0)
> GET /user/npavlovikj/public/1574462944/00/0D/DRR106903_transposed_report.tsv HTTP/1.1
> User-Agent: curl/7.29.0
> Host: sc-cache.chtc.wisc.edu:8000
> Accept: */*
> 
< HTTP/1.1 200 OK
< Connection: Keep-Alive
< Content-Length: 0
< 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host sc-cache.chtc.wisc.edu left intact
Traceback (most recent call last):
  File "/cvmfs/connect.opensciencegrid.org/modules/packages/linux-rhel7-x86_64/gcc-6.4.0spack/stashcache-5.5.0-pccc2g75vucsb7rrkwnycwefoig3ydxg/bin/stashcp", line 12, in <module>
    load_entry_point('stashcp==5.5.0', 'console_scripts', 'stashcp')()
  File "/cvmfs/connect.opensciencegrid.org/modules/packages/linux-rhel7-x86_64/gcc-6.4.0spack/stashcache-5.5.0-pccc2g75vucsb7rrkwnycwefoig3ydxg/lib/python2.7/site-packages/stashcp/__init__.py", line 706, in main
    result = doStashCpSingle(sourceFile=source, destination=destination, methods = methods, debug=args.debug)
  File "/cvmfs/connect.opensciencegrid.org/modules/packages/linux-rhel7-x86_64/gcc-6.4.0spack/stashcache-5.5.0-pccc2g75vucsb7rrkwnycwefoig3ydxg/lib/python2.7/site-packages/stashcp/__init__.py", line 214, in doStashCpSingle
    if download_http(sourceFile, destination, debug, payload):
  File "/cvmfs/connect.opensciencegrid.org/modules/packages/linux-rhel7-x86_64/gcc-6.4.0spack/stashcache-5.5.0-pccc2g75vucsb7rrkwnycwefoig3ydxg/lib/python2.7/site-packages/stashcp/__init__.py", line 410, in download_http
    dlSz=os.stat(final_destination).st_size
OSError: [Errno 2] No such file or directory: '/tmp/DRR106903_transposed_report.tsv'

This might have been fixed in Curl 7.42 (see https://curl.haxx.se/mail/archive-2015-08/0007.html) but it seems like the curl community still argues what the ciorrect behavior is, and if it should be changed back.

What is interesting about the failure case above is that curl signals success, and thing are failing later in the stashcp code. One approach here would be to fall back to xrootd (as I had it configured with --methods=http,xrootd)

djw8605 commented 4 years ago

Interesting, so curl detects a successful download, but it doesn't write a file?

rynge commented 4 years ago

Yes, the idea was to not start the writing process until data start coming in. If not data comes in, the files not created. On EL7 (curl 7.29), the file is not created. On EL8 (curl 7.62) it is.