SWI-Prolog / packages-http

The SWI-Prolog HTTP server and client libraries
23 stars 23 forks source link

Newer versions of SSL picky about attempts to read from options. #157

Closed GavinMendelGleason closed 1 year ago

GavinMendelGleason commented 1 year ago

With Ubuntu 22.10, SSL libraries are more restrictive about closed connections. This has lead to the inability to use the current http_get in conjunction with method(options). Our tus library has written the following helper predicate as a work-around, but it would be nice if methods which require no data could avoid reading from the stream after headers are parsed.

Our original call was:

tus_options(Endpoint, Tus_Options, Options) :-
    http_get(Endpoint, _, [
                 method(options),
                 reply_header(Pre_Options),
                 status_code(204)
                 |Options
             ]),
    tus_process_options(Pre_Options, Tus_Options).

The error returned without this work-around is: ssl_error('0A000126','SSL routines','','unexpected eof while reading')

Our work-around is:

tus_options(Endpoint, Tus_Options, Options) :-
    http_client:headers_option(Options, Options1, Headers),
    option(reply_header(Headers), Options, _),
    http_client:http_open(Endpoint, In, [method(options),
                                         status_code(204)
                                         |Options1]),
    close(In),
    tus_process_options(Headers, Tus_Options).

This is essentially en lieu of:

http_get(URL, Data, Options) :-
    headers_option(Options, Options1, Headers),
    option(reply_header(Headers), Options, _),
    http_open(URL, In, Options1),
    delete(Headers1, transfer_encoding(_), Headers2),
    call_cleanup(
        http_read_data(In, Headers2, Data, Options),
        close(In)).
GavinMendelGleason commented 1 year ago

By the way, we thought about making a PR, but we were unsure what to return as Data. If we could leave it unbound it should be trivial to make use of our workaround as one of the clauses of http_read_data perhaps.

JanWielemaker commented 1 year ago

I'd be happy returning the atom ''. I'm still a little unsure why this happens though. If I try an options request on an NGINX server it nicely returns

HTTP/2 200 
server: nginx/1.14.0 (Ubuntu)
date: Thu, 09 Feb 2023 12:52:42 GMT
content-length: 0
access-control-allow-origin: *
access-control-allow-methods: GET, POST, PUT, DELETE

The content-length: 0 should be enough to stop http_get trying to read something (and would it make return '', which is why I think that is a good reply). Is the server you are talking to not replying with a content-length: 0 header?

Note that you may also consider to pass connection('Keep-alive'). Especially if you use the pre-flight OPTIONS method.

GavinMendelGleason commented 1 year ago

It's the client SSL end of the equation that is the one getting testy in newer libraries. We were able to use options without incident previously, but the problem is now reproducible in newer Debian and Ubuntu.

Content-Length: 0 does in fact work (we tested this) but unfortunately some intermediate proxies like to strip this from option calls (such as Cloudflare) and there appears to be no remedy.

That's a good point regards keep-alive!

JanWielemaker commented 1 year ago

Should be addressed by 2ecd7e1463fef1dc08371795f6c128f0ae34b353. I think it is a server error though. It should either send a Content-length: 0 header or properly handle reading to end-of-file. Please let me know if there are further remarks/issues with this.

matko commented 1 year ago

While it's hard to find out why exactly cloudflare would be stripping the Content-Length header, my suspicion is because we were returning a 204 status code, which according to rfc9110 is not allowed to be used in conjunction with the Content-Length header (https://httpwg.org/specs/rfc9110.html#field.content-length).

A server MUST NOT send a Content-Length header field in any response
with a status code of [1xx
(Informational)](https://httpwg.org/specs/rfc9110.html#status.1xx) or
[204 (No
Content)](https://httpwg.org/specs/rfc9110.html#status.204). A server
MUST NOT send a Content-Length header field in any [2xx
(Successful)](https://httpwg.org/specs/rfc9110.html#status.2xx)
response to a CONNECT request ([Section
9.3.6](https://httpwg.org/specs/rfc9110.html#CONNECT)).

So I don't think relying on content-length being there is proper, as there are cases where the spec forbids this header, and it looks like some intermediate proxies take this seriously.

Furthermore, an OPTIONS request should always result in an empty body, regardless of what Content-Length claims, though I suppose servers could be non-compliant and a good client library should be able to deal with a body coming back anyway.

JanWielemaker commented 1 year ago

I see. So, I think that rather checking the method (as we do now), we should check the status code and for 1xx and 204 do the normal thing if there is a content-length (which would mean the server is not correct, but on the other hand it explicitly says there is content, so we must read it, particularly so if we are in Keep-Alive mode) and read nothing if there is not. Would you agree?

matko commented 1 year ago

Yeah, I agree. I also think though that the client should be able to interpret a connection being closed after the headers are sent as an implicit no-content. This case is common enough that google is littered with people running into this kind of issue. SSL throwing an error in this case is just unfortunate. So in addition to considering 204 and 1xx as no-content cases, can't we also just catch that unexpected eof error coming from the ssl library, and treat it as if Data was ''?

JanWielemaker commented 1 year ago

I'm very reluctant interpreting errors as no-data. If an application wants that, use http_open/3, use the headers and deal with errors the way you want while reading the body.

I'm happy not to rely on the method though. I didn't like that at all (but sometimes you have to do such things).

JanWielemaker commented 1 year ago

Ok, see take two at 17a093f4a658b44319cb27d8da6eb84dce7fb04d. Hope this fixes the issue. Surely it is better to understand why it is required.

matko commented 1 year ago

Yup that all makes logical sense to me. Thanks a lot :).