xrootd / xrootd

The XRootD central repository https://my.cdash.org/index.php?project=XRootD
http://xrootd.org
Other
163 stars 151 forks source link

[Regression] Chunked PUT creates empty files with 5.6.1 XrdHTTP #2058

Closed olifre closed 1 year ago

olifre commented 1 year ago

This feels a bit similar to #1411, but the details are different.

Running the following minimal example (based on the Nagios probe used by EGI ARGO monitoring, i.e. https://gitlab.cern.ch/lcgdm/nagios-plugins-webdav/-/blob/master/src/check_webdav), note you may need to adapt the URL for testing yourself:

#!/usr/bin/env python

import os
import pycurl

curl = pycurl.Curl()
#curl.setopt(pycurl.FOLLOWLOCATION, 1)

curl.setopt(pycurl.CAINFO, os.getenv("X509_USER_PROXY"))
curl.setopt(pycurl.SSLCERT, os.getenv("X509_USER_PROXY"))
curl.setopt(pycurl.CAPATH, "/etc/grid-security/certificates")
curl.setopt(pycurl.URL, "https://myserver..com:1094//cephfs/grid/atlas/user/scratch/freyermu/testfile")
curl.setopt(pycurl.HTTPHEADER, ["Transfer-Encoding: chunked"])

TEST_DATA = "This is some text 0x1d bytes."

curl.setopt(curl.VERBOSE, 1)

def print_traffic(msg_kind, msg):
  if msg_kind < 1 or msg_kind > 4:
    return
  if msg_kind % 2 ==0:
    print("[%2d]> %s" % (msg_kind, str(msg)))
  elif msg_kind % 2 == 1:
    print("[%2d]< %s"  % (msg_kind, str(msg)))

curl.setopt(pycurl.DEBUGFUNCTION, lambda x, y: print_traffic(x, y))

class StringReader:
  def __init__(self, data):
    self.pos = 0
    self.data = data
  def read(self, size):
    ret = self.data[self.pos:self.pos+size]
    self.pos=self.pos+size

    if len(ret) == 0:
      self.pos = 0
    return ret

curl.setopt(pycurl.UPLOAD, 1)
stringreader = StringReader(TEST_DATA)
curl.setopt(pycurl.READFUNCTION, stringreader.read)

curl.perform()

print(curl.getinfo(pycurl.HTTP_CODE))

curl.setopt(pycurl.UPLOAD, 0)
curl.perform()

yields the following when run against XRootD 5.5.5:

[ 2]> b'PUT //cephfs/grid/atlas/user/scratch/freyermu/testfile HTTP/1.1\r\nHost: xrootd001.physik.uni-bonn.de:1094\r\nUser-Agent: PycURL/7.45.2 libcurl/8.0.1 (NSS/3.91) OpenSSL/3.0.9 zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 c-ares/1.19.1 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.51.0 librtmp/2.3\r\nAccept: */*\r\nTransfer-Encoding: chunked\r\nExpect: 100-continue\r\n\r\n'
[ 1]< b'HTTP/1.1 100 Continue\r\n'
[ 1]< b'Connection: Close\r\n'
[ 1]< b'Server: XrootD/v5.5.5\r\n'
[ 4]> b'1d\r\nThis is some text 0x1d bytes.\r\n'
[ 4]> b'0\r\n\r\n'
[ 1]< b'HTTP/1.1 200 OK\r\n'
[ 1]< b'Connection: Keep-Alive\r\n'
[ 1]< b'Server: XrootD/v5.5.5\r\n'
[ 1]< b'Content-Length: 3\r\n'
[ 1]< b'\r\n'
[ 3]< b':-)'
200
[ 2]> b'GET //cephfs/grid/atlas/user/scratch/freyermu/testfile HTTP/1.1\r\nHost: xrootd001.physik.uni-bonn.de:1094\r\nUser-Agent: PycURL/7.45.2 libcurl/8.0.1 (NSS/3.91) OpenSSL/3.0.9 zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 c-ares/1.19.1 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.51.0 librtmp/2.3\r\nAccept: */*\r\nTransfer-Encoding: chunked\r\n\r\n'
[ 1]< b'HTTP/1.1 200 OK\r\n'
[ 1]< b'Connection: Keep-Alive\r\n'
[ 1]< b'Server: XrootD/v5.5.5\r\n'
[ 1]< b'Content-Length: 29\r\n'
[ 1]< b'\r\n'
[ 3]< b'This is some text 0x1d bytes.'
:-)This is some text 0x1d bytes.

i.e. the file is uploaded successfully and can be retrieved again with expected content.

However, with XRootD 5.6.1, the following can be observed:

[ 2]> b'PUT //cephfs/grid/atlas/user/scratch/freyermu/testfile HTTP/1.1\r\nHost: xrootd001.physik.uni-bonn.de:1094\r\nUser-Agent: PycURL/7.45.2 libcurl/8.0.1 (NSS/3.91) OpenSSL/3.0.9 zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 c-ares/1.19.1 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.51.0 librtmp/2.3\r\nAccept: */*\r\nTransfer-Encoding: chunked\r\nExpect: 100-continue\r\n\r\n'
[ 1]< b'HTTP/1.1 100 Continue\r\n'
[ 1]< b'Connection: Close\r\n'
[ 1]< b'Server: XrootD/v5.6.1\r\n'
[ 4]> b'1d\r\nThis is some text 0x1d bytes.\r\n'
[ 4]> b'0\r\n\r\n'
[ 1]< b'HTTP/1.1 200 OK\r\n'
[ 1]< b'Connection: Keep-Alive\r\n'
[ 1]< b'Server: XrootD/v5.6.1\r\n'
[ 1]< b'Content-Length: 3\r\n'
[ 1]< b'\r\n'
[ 3]< b':-)'
200
[ 2]> b'GET //cephfs/grid/atlas/user/scratch/freyermu/testfile HTTP/1.1\r\nHost: xrootd001.physik.uni-bonn.de:1094\r\nUser-Agent: PycURL/7.45.2 libcurl/8.0.1 (NSS/3.91) OpenSSL/3.0.9 zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 c-ares/1.19.1 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.51.0 librtmp/2.3\r\nAccept: */*\r\nTransfer-Encoding: chunked\r\n\r\n'
[ 1]< b'HTTP/1.1 200 OK\r\n'
[ 1]< b'Connection: Keep-Alive\r\n'
[ 1]< b'Server: XrootD/v5.6.1\r\n'
[ 1]< b'Content-Length: 0\r\n'
[ 1]< b'\r\n'
:-)

i.e. the file appears to be uploaded successfully, but is zero length. This also matches what is observed in the backend storage, so it seems the server does not parse the provided length correctly anymore.

amadio commented 1 year ago

Thanks for the detailed report!

ccaffy commented 1 year ago

Hi all,

Thanks @olifre for your detailed test, I could reproduce the problem!

I managed to have it working if I use the header 'X-Transfer-Status: true' in the client request's header:

curl.setopt(pycurl.HTTPHEADER, ["X-Transfer-Status: true"])

This "problem" comes from the following commit that explicitely explains, in the message, that the header 'TransferEncoding: chunked' should not be used in the request: https://github.com/xrootd/xrootd/commit/bf176f3579c56bb1a0421096e10749c957bb65f2

Would that be OK if you use the X-Transfer-Status: true header instead of the TransferEncoding one?

amadio commented 1 year ago

@ccaffy Is this not a bug, then? Please relabel accordingly if the current behavior is expected, although happily creating a zero length file on the other end may end up being a problem. We may need to error out when receiving the unwanted header with a malformed request or similar error.

ccaffy commented 1 year ago

@ccaffy Is this not a bug, then? Please relabel accordingly if the current behavior is expected, although happily creating a zero length file on the other end may end up being a problem. We may need to error out when receiving the unwanted header with a malformed request or similar error.

Hi @amadio,

It's not a bug if we look at the commit, but indeed, we may want to have something to prevent 0-size file to be created on the server side with such a request. @abh3 what do you think ?

olifre commented 1 year ago

HI @ccaffy ,

that is interesting, since using Transfer-Encoding: chunked for requests is the standard way used by the official EGI probe and also documented on the official curl manpage as "to be used for requests". Quoting the manpage of curl:

Passing on a "Transfer-Encoding: chunked" header when doing an HTTP request with a request body, will make curl send the data using chunked encoding.

So it's quite surprising to me that this is not conforming to the standard — where exactly did you find this statement? If this is some RFC, we should open a GGUS issue for EGI monitoring, and file bugs against the libcurl project since they use this explicitly.

In any case, this means that XRootD 5.6 can not be used in EGI-connected infrastructures anymore unless the probe is adapted, so European grid sites can not update to 5.6 for the time being.

ccaffy commented 1 year ago

Hi @olifre ,

@CohenQU did this commit and put in the commit message the following:

This commit updates the handling of HTTP headers to conform to HTTP standards and improve support for chunked transfer encoding and status trailers.

Previously, the server incorrectly expected to receive the "Transfer-Encoding: chunked" header in HTTP requests. However, it has been realized that the "Transfer-Encoding" header is reserved for HTTP responses and should not be anticipated in an HTTP request.

We've now corrected this behavior by removing the check for the "Transfer-Encoding" header in the incoming request. Instead, we now set the internal variable `m_transfer_encoding_chunked` when the client includes the custom header "X-Transfer-Status: true". This change aligns with the original intention, indicating that the client is prepared to receive a response using chunked transfer encoding and a status trailer.

The `m_trailer_headers` variable is still set when the client includes the "TE: trailers" header, indicating its acceptance of trailer fields in the response.

@CohenQU , can you please take a look and tell us whether this commit should be reverted or not?

Thank you very much in advance.

ccaffy commented 1 year ago

From the Mozilla firefox page: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding

It looks like they only talk about clients...

olifre commented 1 year ago

From the Mozilla firefox page: [...]

Indeed, additionally, the hard statement is only for HTTP/2 and they of course do have the implicit viewpoint of a web browser client.

amadio commented 1 year ago

From what I read in RFC9112, we may need to actually revert this patch for 5.6.2.

ccaffy commented 1 year ago

From what I read in RFC9112, we may need to actually revert this patch for 5.6.2.

I agree, let's see what @CohenQU says, and we may revert afterwards.

CohenQU commented 1 year ago

My apologies for the delay in my response. I would like to provide some context for the previous commit that was made.

We discovered that some clients were automatically removing the Transfer-Encoding header before sending the request to the server. In order to still give these clients the opportunity to set the internal variable m_transfer_encoding_chunked, and to address the issue here for the EGI probe, I suggest an update to the code segment located in src/XrdHttp/XrdHttpReq.cc on lines 185-192.

Currently, the code reads as follows:

} else if (!strcmp(key, "Expect") && strstr(val, "100-continue")) {
  sendcontinue = true;
} else if (!strcasecmp(key, "TE") && strstr(val, "trailers")) {
  m_trailer_headers = true;
} else if (!strcasecmp(key, "X-Transfer-Status") && strstr(val, "true")) {
  m_transfer_encoding_chunked = true;
  m_status_trailer = true;
}

I suggest modifying it to:

} else if (!strcmp(key, "Expect") && strstr(val, "100-continue")) {
  sendcontinue = true;
} else if (!strcasecmp(key, "TE") && strstr(val, "trailers")) {
  m_trailer_headers = true;
} else if (!strcasecmp(key, "Transfer-Encoding") && strstr(val, "chunked")) {
  m_transfer_encoding_chunked = true; 
} else if (!strcasecmp(key, "X-Transfer-Status") && strstr(val, "true")) {
  m_transfer_encoding_chunked = true;
  m_status_trailer = true;
}

Basically, we still keep the check for X-Transfer-Status: true and add back the check for Transfer-Encoding: chunked. This change should resolve the problems introduced in the previous commit and also address the irregular behavior observed with the EGI probe.

@olifre Let me know if this resolves the issue, @ccaffy @timj let me know if this makes sense.

olifre commented 1 year ago

My apologies for the delay in my response. I would like to provide some context for the previous commit that was made.

Thanks, that is much appreciated :+1: .

We discovered that some clients were automatically removing the Transfer-Encoding header before sending the request to the server.

Can you elaborate which clients are affected? Maybe the clients themselves can be fixed, since X-Transfer-Status is not really standardsized, and it seems strange to have a special header treated in XRootD when it's essentially to work around a broken client implementation, which will still be unusable with other servers (unless fixing it is impossible)?

@olifre Let me know if this resolves the issue,

I'll compile a test build over the weekend and perform several tests against it to check if the probe etc. works again as expected (but code-wise, I don't see why not).

ccaffy commented 1 year ago

@CohenQU thanks a lot for your response :) I'm perfectly fine with your proposal to put back the parsing of the header Transfer-Encoding: chunked.

Please submit your PR so it can be reviewed and integrated in the next XRootD release.

Thanks a lot in advance.

olifre commented 1 year ago

@CohenQU I tried a bit more with 5.6.1, and the following sadly also breaks:

curl -vL -H "Authorization: Bearer $(oidc-token my-token)" -H "X-Transfer-Status: true" -X PUT --upload-file ~/testfile2 https://my-server.physik.uni-bonn.de:1094//data/grid/helmholtz/testfile

This yields:

* using HTTP/1.x
> PUT //data/grid/helmholtz/testfile HTTP/1.1
> Host: xrootd001-dev.physik.uni-bonn.de:1094
> User-Agent: curl/8.0.1
> Accept: */*
> Authorization: Bearer ABC
> X-Transfer-Status: true
> Content-Length: 16
> 
* We are completely uploaded and fine
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/1.1 400 Bad Request
< Connection: Close
< Server: XrootD/v5.6.1
< Content-Length: 24
< 
* Closing connection 0
* TLSv1.3 (IN), TLS alert, close notify (256):
* TLSv1.3 (OUT), TLS alert, close notify (256):
Invalid chunked encoding

i.e. XRootD assumes chunked encoding is used even though it is not requested by the client. This is not fixed by the proposed patch.

ccaffy commented 1 year ago

@CohenQU I tried a bit more with 5.6.1, and the following sadly also breaks:

curl -vL -H "Authorization: Bearer $(oidc-token my-token)" -H "X-Transfer-Status: true" -X PUT --upload-file ~/testfile2 https://my-server.physik.uni-bonn.de:1094//data/grid/helmholtz/testfile

This yields:

* using HTTP/1.x
> PUT //data/grid/helmholtz/testfile HTTP/1.1
> Host: xrootd001-dev.physik.uni-bonn.de:1094
> User-Agent: curl/8.0.1
> Accept: */*
> Authorization: Bearer ABC
> X-Transfer-Status: true
> Content-Length: 16
> 
* We are completely uploaded and fine
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/1.1 400 Bad Request
< Connection: Close
< Server: XrootD/v5.6.1
< Content-Length: 24
< 
* Closing connection 0
* TLSv1.3 (IN), TLS alert, close notify (256):
* TLSv1.3 (OUT), TLS alert, close notify (256):
Invalid chunked encoding

i.e. XRootD assumes chunked encoding is used even though it is not requested by the client. This is not fixed by the proposed patch.

@olifre does it work with XRootD 5.5.5 and the header Transfer-Encoding: chunked ?

olifre commented 1 year ago

@olifre does it work with XRootD 5.5.5 and the header Transfer-Encoding: chunked ?

Interestingly, it does not — but also specifying Transfer-Encoding: chunked without X-Transfer-Status: true fails with curl in both 5.5.5 and 5.6.1. So it seems the chunked encoding curl does when used via CLI is incompatible with what XRootD expected in general. So this part is indeed not really a regression, however, it's quite unintuitive that X-Transfer-Status: true implies a specific Transfer-Encoding (and even if Content-Length is specified, XRootD now assumes the client uses chunked enconding).

olifre commented 1 year ago

I'll compile a test build over the weekend and perform several tests against it to check if the probe etc. works again as expected (but code-wise, I don't see why not).

My test machine was faster than expected :smile: . I can already confirm the proposed patch restores compatibility with the EGI Nagios probe and the usual standard upload / download tests I do still work. :+1:

xrootd-dev commented 1 year ago

Sorry for the late reply but I was traveling. Interestingly, I've seen quite a few servers creating zero length files when there was a prolem with a header. So, I can't say there is any standard of what should be done here. In general, I would try very hard not to create zero length files when there is an expectation that an upload should succeed. That said, even xroot protocol does not define the behaviour and the server can leave zero length files if the transfer violated some aspect of the protocol after the file was created (i.e. the first transfer request). In xroot that makes sense because we have an explicit open() while http does not. So I'd say if you detect a header error prior to creating a file then you should not create a zero length file. However, I don't see how you can always detect that as you don't necessarily read all of the headers, right?

Andy

On Fri, 14 Jul 2023, ccaffy wrote:

@ccaffy Is this not a bug, then? Please relabel accordingly if the current behavior is expected, although happily creating a zero length file on the other end may end up being a problem. We may need to error out when receiving the unwanted header with a malformed request or similar error.

Hi @amadio,

It's not a bug if we look at the commit, but indeed, we may want to have something to prevent 0-size file to be created on the server side with such a request. @abh3 what do you think ?

-- Reply to this email directly or view it on GitHub: https://github.com/xrootd/xrootd/issues/2058#issuecomment-1635764296 You are receiving this because you are subscribed to this thread.

Message ID: @.***>

######################################################################## Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-DEV list, click the following link: https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1

CohenQU commented 1 year ago

@ccaffy @olifre @amadio. I was at a conference for the past week and only have limited access to Github, sorry for the late update again. The PR #2059 with the proposed patch update is ready for review now.

amadio commented 1 year ago

Fixed by #2059.