axel-download-accelerator / axel

Lightweight CLI download accelerator
GNU General Public License v2.0

Consider upgrading HTTP requests from HTTP/1.0 #442

Closed pzygielo closed 4 months ago

pzygielo commented 4 months ago

I was surprised that axel did not meet my expectations when downloading from a lighttpd/1.4.76 server. It turned out that for GET ... HTTP/1.0, lighttpd responds with HTTP/1.0 200 OK (which sets axel->...->supported to false, because no Content-Range header is sent in that case). On the other hand, exactly the same request sent as GET ... HTTP/1.1 results in HTTP/1.1 206 Partial Content (with Content-Range), since 206 is defined in HTTP/1.1.

Previously https://github.com/axel-download-accelerator/axel/blob/f71d8746a28d3351fb2db48873cace8620a50d48/ChangeLog#L835-L838 but then https://github.com/axel-download-accelerator/axel/blob/f71d8746a28d3351fb2db48873cace8620a50d48/ChangeLog#L812-L813

Some hardcoded HTTP versions: https://github.com/axel-download-accelerator/axel/blob/f71d8746a28d3351fb2db48873cace8620a50d48/src/http.c#L169 https://github.com/axel-download-accelerator/axel/blob/f71d8746a28d3351fb2db48873cace8620a50d48/src/http.c#L172 https://github.com/axel-download-accelerator/axel/blob/f71d8746a28d3351fb2db48873cace8620a50d48/src/http.c#L177

ismaell commented 4 months ago

Technically that's a server-side issue. Also, these ChangeLog entries are self-explanatory and from 2001, nobody implemented chunk-encoding, thus no HTTP/1.1 support.

This is a duplicate of #328.

pzygielo commented 4 months ago

Technically that's a server-side issue.

I don't think it is. axel sends an HTTP/1.0 request, so that is the version negotiated, and HTTP/1.0 does not support partial content responses.

Also, these ChangeLog entries are self-explanatory

Yes, there even is: because in fact HTTP/1.0 does not support Ranges.

and from 2001, nobody implemented chunk-encoding, thus no HTTP/1.1 support.

Then - I don't understand the claim:

Version 0.99 brings you ... Axel's RFC compliant.

pzygielo commented 4 months ago

This is a duplicate of #328.

I'll subscribe to that then. Thanks for the reference.

ismaell commented 4 months ago

Technically that's a server-side issue.

I don't think it is. axel sends an HTTP/1.0 request, so that is the version negotiated, and HTTP/1.0 does not support partial content responses.

That's a server-side issue by definition. It could answer the request with 206, but it doesn't.

Also, these ChangeLog entries are self-explanatory

Yes, there even is: because in fact HTTP/1.0 does not support Ranges.

It isn't standard-compliant, sure. OTOH, most features get implemented then standardised, and both sides rarely match 100%.

Then - I don't understand the claim:

Version 0.99 brings you ... Axel's RFC compliant.

Me neither. Like most things, it was there and nobody stopped to think about it and/or change it. It probably meant "now it's less broken".

pzygielo commented 4 months ago

I don't think it is. axel sends an HTTP/1.0 request, so that is the version negotiated, and HTTP/1.0 does not support partial content responses.

That's a server-side issue by definition. It could answer the request with 206, but it doesn't.

I see that for example nginx does that. It responds with HTTP/1.1 and 206 for axel's 1.0 request. I understand that everything accessed by https (in my case it's plain, unsecured http) is at least HTTP/1.1 as well, so perhaps HTTP/1.0 might be interpreted differently by servers in https case.

But it's not clear to me that, if a client declares 1.0 as the language to be used in the first step and then uses 1.1 words/grammar, the server should upgrade to 1.1. I'd be happy to learn that this is what should happen. Perhaps you already know the relevant paragraph in the RFC; otherwise I'll need to study that case.

Just for completeness, here's the test data I used as the request (no axel involved here):

GET /date.txt HTTP/${HTTP_VERSION_X}
Host: localhost
User-Agent: curl/8.6.0
Accept: */*
Range: bytes=0-

(with HTTP_VERSION_X set to 1.0 and then 1.1) and the responses:

===== 1.0 =====
HTTP/1.0 200 OK
Content-Type: text/plain;charset=utf-8
ETag: "1444196374"
Last-Modified: Sun, 21 Jul 2024 17:40:40 GMT
Content-Length: 30
Connection: close
Date: Mon, 22 Jul 2024 17:32:55 GMT
Server: lighttpd/1.4.76

Sun 21 Jul 19:40:40 CEST 2024
===== 1.1 =====
HTTP/1.1 206 Partial Content
Content-Type: text/plain;charset=utf-8
ETag: "1444196374"
Last-Modified: Sun, 21 Jul 2024 17:40:40 GMT
Content-Length: 30
Accept-Ranges: bytes
Content-Range: bytes 0-29/30
Date: Mon, 22 Jul 2024 17:32:55 GMT
Server: lighttpd/1.4.76

Sun 21 Jul 19:40:40 CEST 2024
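
For anyone wanting to repeat the comparison above without axel or curl, here is a minimal Python sketch. It assumes the same request headers as shown in the thread; the function names `build_request` and `fetch_status_line`, the port, and the timeout are illustrative choices, not anything taken from axel.

```python
import socket


def build_request(path: str, host: str, version: str) -> bytes:
    """Build the raw GET request from the thread for a given HTTP version."""
    lines = [
        f"GET {path} HTTP/{version}",
        f"Host: {host}",
        "User-Agent: curl/8.6.0",
        "Accept: */*",
        "Range: bytes=0-",
        "",  # the two empty strings produce the blank line
        "",  # that terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(host: str, port: int, path: str, version: str) -> str:
    """Send the request and return only the status line of the response."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_request(path, host, version))
        data = b""
        # Read until the first CRLF; the status line is all we need here.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling `fetch_status_line("localhost", 80, "/date.txt", v)` for `v` in `"1.0"` and `"1.1"` (port 80 being an assumption about the local setup) should show whether a given server answers the two versions differently.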

As shown, the responses follow the HTTP version of the request. axel then marks the connection as not supporting Range. But the server supports that. axel is unable to detect that because it says that it speaks 1.0. That's how I see it.
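
The detection described here can be sketched in a few lines of Python. This is only an illustration of the rule "206 plus Content-Range means the range was honoured"; it is not axel's actual code, and `parse_head` and `range_honoured` are hypothetical names. The two sample heads are abbreviated from the responses shown above.

```python
def parse_head(head: str):
    """Split a response head into (version, status code, headers dict)."""
    lines = head.strip().splitlines()
    version, status, *_ = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), headers


def range_honoured(head: str) -> bool:
    """True if the server answered with 206 and a Content-Range header."""
    _version, status, headers = parse_head(head)
    return status == 206 and "content-range" in headers


HEAD_10 = """HTTP/1.0 200 OK
Content-Length: 30
Connection: close
Server: lighttpd/1.4.76
"""

HEAD_11 = """HTTP/1.1 206 Partial Content
Content-Length: 30
Accept-Ranges: bytes
Content-Range: bytes 0-29/30
Server: lighttpd/1.4.76
"""
```

With these inputs, `range_honoured(HEAD_10)` is false and `range_honoured(HEAD_11)` is true, which matches the observation that the same server looks range-incapable only when addressed as HTTP/1.0.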

Thanks for checking.

pzygielo commented 4 months ago

But the server supports that. axel is unable to detect that because it says that it speaks 1.0

Even if range support were detected, that would probably not work in axel's next steps: if it sent HTTP/1.0 requests for the parts, the server in question would ignore the Range header again.
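
To make "requests for parts" concrete, here is a rough Python sketch of how a multi-connection downloader might split a file into byte ranges and form one Range header per connection. It is an assumption-laden illustration, not how axel actually divides work; `split_ranges` and `range_header` are hypothetical helpers.

```python
def split_ranges(size: int, parts: int):
    """Return inclusive (first, last) byte offsets that cover `size` bytes."""
    if size <= 0 or parts <= 0:
        raise ValueError("size and parts must be positive")
    parts = min(parts, size)  # never create empty ranges
    base, extra = divmod(size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` parts get one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(first: int, last: int) -> str:
    """Format a single-range Range header line."""
    return f"Range: bytes={first}-{last}"
```

For the 30-byte file in the test above, `split_ranges(30, 4)` yields `[(0, 7), (8, 15), (16, 22), (23, 29)]`. Each of those would go out as its own request, so a server that ignores Range on HTTP/1.0 requests would return the full body four times.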

ismaell commented 4 months ago

Nothing to do with HTTPS. No idea if it's standard server-side behaviour, but plenty of servers upgrade 1.0 clients to 1.1 when they request something that could be serviced in a standard way in HTTP/1.1. It's up to the client to bail out and Axel doesn't. There's nothing in the standard about ignoring Request-Range on HTTP/1.0 requests, and there are explicit mentions of other headers in similar scenarios, like the Upgrade header, so I guess the behaviour is fine.

pzygielo commented 4 months ago

Nothing to do with HTTPS.

My mistake to mention it.


For some servers, 1.1 is the lowest supported protocol version, so they match a 1.0 request as best they can. As they don't know how to speak 1.0, they do 1.1: they read the request as 1.1 and understand the Range: header.

No idea if it's standard server-side behaviour, but plenty of servers upgrade 1.0 clients to 1.1 when they request something that could be serviced in a standard way in HTTP/1.1. It's up to the client to bail out and Axel doesn't.

I suppose those servers just can't do 1.0, so it looks like an upgrade.

RFC 9110:

The minor version advertises the sender's communication capabilities even when the sender is only using a backwards-compatible subset of the protocol, thereby letting the recipient know that more advanced features can be used in response (by servers) or in future requests (by clients).

My understanding is that, in the described case, the server knows how to speak 1.0 and honours the agent's request. It reads 1.0 and writes 1.0. (It also reads 1.1 and writes 1.1, but not when communicating with axel.)

There's nothing in the standard about ignoring Request-Range on HTTP/1.0 requests

Given the above, as the response is 1.0, it cannot use a 1.1 feature. That the response is 1.0 is determined not by the presence of the Range: header, but by the version in the request.
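
The two behaviours discussed in this thread can be contrasted with a toy Python model. This only encodes what was observed or reported above (lighttpd answering in the request's version, nginx reportedly answering 1.1 with 206 to a 1.0 request); it says nothing about how either server is implemented, and the policy names `"mirror"` and `"always-1.1"` are made up for the example.

```python
def respond(request_version: str, has_range: bool, policy: str):
    """Return (response_version, status) for a toy server.

    policy "mirror":     answer in the request's version (as lighttpd did above).
    policy "always-1.1": answer as HTTP/1.1 regardless (as reported for nginx).
    """
    if policy == "mirror":
        response_version = request_version
    elif policy == "always-1.1":
        response_version = "1.1"
    else:
        raise ValueError(f"unknown policy: {policy}")
    # In this model, 206 is only available when the response is HTTP/1.1.
    status = 206 if has_range and response_version == "1.1" else 200
    return response_version, status
```

Under the "mirror" policy a 1.0 request with a Range header gets `("1.0", 200)`, while the same request under "always-1.1" gets `("1.1", 206)`; in both cases the outcome follows from the response version, not from the presence of the header.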

and there are explicit mentions of other headers in similar scenarios, like the Upgrade header, so I guess the behaviour is fine.

RFC 9110:

A server that receives an Upgrade header field in an HTTP/1.0 request MUST ignore that Upgrade field.


I think I shared everything I collected for this issue, and I consider my work to be done here.

Thank you for your time.

ismaell commented 4 months ago

There's wiggle room for several interpretations, but it's all meaningless, it works for most servers, and where it doesn't work Axel just needs to get HTTP/1.1 support implemented; do you have a patch?

BTW, what's missing is chunked transfer encoding support. Anyway, feel free to implement it.

pzygielo commented 4 months ago

There's wiggle room for several interpretations, but it's all meaningless, it works for most servers

Yeah.

Anyway, feel free to implement it.

Ha, ha. :smile:

There is still more for me to learn about the HTTP spec than I initially planned on Saturday evening :grin: