dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

HTTP-TPC push fails with no Content-Length header from destination #5549

Closed vokac closed 4 years ago

vokac commented 4 years ago

When I try to use dCache as the source for HTTP-TPC in push mode and the destination doesn't return Content-Length, the transfer fails with the error message "verification failed: remote server failed to provide length after x.yy min".

$ export SRC=https://prometheus.desy.de:2443/VOs/dteam/0b
$ export DST=https://dpmhead-trunk.cern.ch/dpm/cern.ch/home/dteam/0b
$ touch /tmp/0b
$ gfal-copy file:///tmp/0b "$SRC"
Copying file:///tmp/0b   [DONE]  after 0s
$ gfal-rm "$DST"
https://dpmhead-trunk.cern.ch/dpm/cern.ch/home/dteam/0b DELETED
$ export TSRC=$(curl --silent --cert /tmp/x509up_u$(id -u) --key /tmp/x509up_u$(id -u) --cacert /tmp/x509up_u$(id -u) --capath /etc/grid-security/certificates -X POST -H 'Content-Type: application/macaroon-request' -d '{"caveats": ["activity:DOWNLOAD"], "validity": "PT30M"}' "$SRC" | jq -r '.macaroon')
$ export TDST=$(curl --silent --cert /tmp/x509up_u$(id -u) --key /tmp/x509up_u$(id -u) --cacert /tmp/x509up_u$(id -u) --capath /etc/grid-security/certificates -X POST -H 'Content-Type: application/macaroon-request' -d '{"caveats": ["activity:UPLOAD,DELETE,LIST"], "validity": "PT30M"}' "$DST" | jq -r '.macaroon')
$ curl -v --capath /etc/grid-security/certificates -L -X COPY -H 'Secure-Redirection: 1' -H 'X-No-Delegate: 1' -H 'Credentials: none' -H "Authorization: Bearer $TSRC" -H "TransferHeaderAuthorization: Bearer $TDST" -H "Destination: $DST" "$SRC"
* About to connect() to prometheus.desy.de port 2443 (#0)
*   Trying 2001:638:700:1005::1:95...
* Connected to prometheus.desy.de (2001:638:700:1005::1:95) port 2443 (#0)
...
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*   subject: CN=prometheus.desy.de,OU=DESY,O=GermanGrid,C=DE
*   start date: Jan 07 14:06:48 2020 GMT
*   expire date: Feb 05 14:06:48 2021 GMT
*   common name: prometheus.desy.de
*   issuer: CN=GridKa-CA,O=GermanGrid,C=DE
> COPY /VOs/dteam/0b HTTP/1.1
> User-Agent: curl/7.29.0
> Host: prometheus.desy.de:2443
> Accept: */*
> Secure-Redirection: 1
> X-No-Delegate: 1
> Credentials: none
> Authorization: Bearer MDAyNWxvY2F0aW9uIE9wdGlvbmFsWy9WT3MvZHRlYW0vMGJdCjAwMThpZGVudGlmaWVyIHY5bW91dkFjCjAwMTVjaWQgaWlkOkhDS1BsN2wrCjAwMWJjaWQgaWQ6MTAwMTsxMDAxO2R0ZWFtCjAwMmJjaWQgYmVmb3JlOjIwMjAtMDktMTJUMjI6MDU6NTcuMzI2MTc1WgowMDE4Y2lkIGhvbWU6L1ZPcy9kdGVhbQowMDFiY2lkIHBhdGg6L1ZPcy9kdGVhbS8wYgowMDFhY2lkIGFjdGl2aXR5OkRPV05MT0FECjAwMmZzaWduYXR1cmUgcIQkujGuaJMGqpQyNtDWujBjr_bB2gG-0Fx3AaWH99kK
> TransferHeaderAuthorization: Bearer dpm-macaroonMDAyOGxvY2F0aW9uIC9kcG0vY2Vybi5jaC9ob21lL2R0ZWFtLzBiCjAwMTZpZGVudGlmaWVyIGNvbmZpZwowMDU1Y2lkIGRuOi9EQz1jaC9EQz1jZXJuL09VPU9yZ2FuaWMgVW5pdHMvT1U9VXNlcnMvQ049dm9rYWMvQ049NjEwMDcxL0NOPVBldHIgVm9rYWMKMDAxM2NpZCBmcWFuOmR0ZWFtCjAwMWFjaWQgZnFhbjpkdGVhbS9OR0lfQ1oKMDAyOGNpZCBwYXRoOi9kcG0vY2Vybi5jaC9ob21lL2R0ZWFtLzBiCjAwMjRjaWQgYWN0aXZpdHk6VVBMT0FELERFTEVURSxMSVNUCjAwMjRjaWQgYmVmb3JlOjIwMjAtMDktMTJUMjM6MDU6NTlaCjAwMmZzaWduYXR1cmUgQ34qHIWNK3Ca0CMU-Zag2-t1DFzEQYvW45v2KVHZxnMK
> Destination: https://dpmhead-trunk.cern.ch/dpm/cern.ch/home/dteam/0b
> 
< HTTP/1.1 202 Accepted
< Date: Sat, 12 Sep 2020 21:37:44 GMT
< X-Clacks-Overhead: GNU Terry Pratchett
< Server: dCache/7.0.0-SNAPSHOT
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: *
< Content-Type: text/perf-marker-stream
< Transfer-Encoding: chunked
< 
Perf Marker
    Timestamp: 1599946664
    State: 1
    State description: querying file metadata
    Stripe Index: 0
    Total Stripe Count: 1
End
Perf Marker
    Timestamp: 1599946669
    State: 10
    State description: transfer has started
    Stripe Index: 0
    Stripe Start Time: 1599946665
    Stripe Last Transferred: 1599946665
    Stripe Transfer Time: 0
    Stripe Bytes Transferred: 0
    Stripe Status: RUNNING
    Total Stripe Count: 1
End
...
Perf Marker
    Timestamp: 1599946725
    State: 10
    State description: transfer has started
    Stripe Index: 0
    Stripe Start Time: 1599946665
    Stripe Last Transferred: 1599946665
    Stripe Transfer Time: 0
    Stripe Bytes Transferred: 0
    Stripe Status: RUNNING
    Total Stripe Count: 1
End
failure: verification failed: remote server failed to provide length after 4.41 min
* Connection #0 to host prometheus.desy.de left intact

By default, Apache strips the Content-Length header from a HEAD response when its value is zero. This is valid behaviour, because the HTTP specification only says that a server MAY send Content-Length in a response to a HEAD request.

This means it is currently not possible to transfer zero-size files with HTTP-TPC push to DPM (Dynafed?), because this storage implementation relies on Apache for the HTTP protocol. A workaround was implemented for future DPM releases (LCGDM-2947): for zero-size files the DPM Apache module returns Content-Length: 00, because this header value is not filtered out by the Apache core.
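To see which of the three cases a destination falls into (header absent, `0`, or the `00` workaround), something like the following could be used. It is only a sketch: `content_length_from_headers` is a hypothetical helper name, and it simply parses raw response headers from stdin, so it makes no assumption about how the headers were fetched.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# numeric Content-Length value. Returns 1 if the header is absent, which is
# the case that makes the push verification fail.
content_length_from_headers() {
    awk -F': *' '
        tolower($1) == "content-length" {
            gsub(/\r/, "", $2)   # headers end in CRLF; drop the CR
            print $2 + 0         # numeric value, so "00" prints as 0
            found = 1
            exit
        }
        END { exit found ? 0 : 1 }
    '
}
```

For example, assuming the macaroon in `$TDST` is still valid: `curl --silent --head -H "Authorization: Bearer $TDST" "$DST" | content_length_from_headers || echo "no Content-Length"`.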

paulmillar commented 4 years ago

Thanks for reporting this issue.

I've been thinking about this and (from my understanding) HTTP should guarantee the integrity of the content length in HTTP PUT requests; that is, a conforming server will return a successful status code to a PUT request if and only if it has received the number of bytes the client sent. This is because a PUT request must include the Content-Length header or use chunked encoding, both of which will detect an incomplete transfer. (As it happens, dCache always specifies Content-Length when making a PUT request.)

Therefore, checking the file size (Content-Length) after successfully uploading a file does not make sense. This is especially true since (as you say) the Content-Length header is optional within a HEAD response.
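As a rough illustration of that reasoning (this is not dCache's actual code; `push_verified` and its arguments are made up for the example), the post-transfer check could treat a missing Content-Length as "nothing to compare" rather than as a failure, and rely on the PUT status code:

```shell
# Sketch only: decide whether a push counts as verified.
#   $1 = HTTP status code of the PUT to the destination
#   $2 = number of bytes sent
#   $3 = Content-Length from a follow-up HEAD, empty if the header was absent
push_verified() {
    put_status=$1
    sent=$2
    reported=${3-}
    case "$put_status" in
        2??) ;;             # the destination accepted the complete body
        *) return 1 ;;
    esac
    if [ -z "$reported" ]; then
        return 0            # header is optional in HEAD responses; trust the PUT
    fi
    [ "$reported" -eq "$sent" ]
}
```

With this policy, `push_verified 201 0 ""` succeeds for the zero-size file in this report, while a mismatching length that the destination does report is still treated as an error.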