Currently, the `Host` header is always sent last because it is added automatically in `wpull.protocol.http.request.Request.prepare_for_send` after the other headers have already been set. I propose changing this so that the `Host` header line is always sent first.
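As a rough illustration of the proposed ordering, here is a minimal sketch. It assumes the headers are available as a list of `(name, value)` pairs; `move_host_first` is a hypothetical helper and not part of wpull, whose actual header container may look different.

```python
def move_host_first(headers):
    """Return a new list of (name, value) pairs with any Host field first.

    Names are compared case-insensitively, and the relative order of all
    other fields is preserved. This is an illustrative helper only.
    """
    host = [(name, value) for name, value in headers if name.lower() == "host"]
    rest = [(name, value) for name, value in headers if name.lower() != "host"]
    return host + rest


# Header order as it would look when Host is appended last.
headers = [
    ("User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0"),
    ("Accept", "*/*"),
    ("Host", "bund.lkr.de"),
]
reordered = move_host_first(headers)
```

Building a new list rather than mutating in place keeps the sketch free of side effects; an actual fix might simply set `Host` before the other headers are added.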
Theoretically, this shouldn't matter. The order of header lines is not significant in HTTP. From RFC 7230 section 3.2.2:
> The order in which header fields with differing field names are
> received is not significant. However, it is good practice to send
> header fields that contain control data first, such as Host on
> requests and Date on responses, so that implementations can decide
> when not to handle a message as early as possible.
Unfortunately, it appears that Cloudflare (only recently?) treats requests differently when the `Host` header does not come first.
Example with curl of different header orders producing different results on Cloudflare:
```
> curl -A 'Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0' https://bund.lkr.de/ -sv --http1.1
[snip]
> GET / HTTP/1.1
> Host: bund.lkr.de
> User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/1.1 307 Temporary Redirect
< Date: Mon, 27 Sep 2021 22:35:35 GMT
< Content-Type: text/html;charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< location: /start/
< CF-Cache-Status: DYNAMIC
< Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< Report-To: [snip]
< NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< Strict-Transport-Security: max-age=0; includeSubDomains; preload
< X-Content-Type-Options: nosniff
< Server: cloudflare
< CF-RAY: [snip]
< alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400
<
* Connection #0 to host bund.lkr.de left intact
> curl -A 'Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0' -H 'Host:' -H 'Host: bund.lkr.de' https://bund.lkr.de/ -sv --http1.1
[snip]
> GET / HTTP/1.1
> User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0
> Accept: */*
> Host: bund.lkr.de
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/1.1 503 Service Temporarily Unavailable
< Date: Mon, 27 Sep 2021 22:35:45 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: close
< X-Frame-Options: SAMEORIGIN
< Permissions-Policy: accelerometer=(),autoplay=(),camera=(),clipboard-read=(),clipboard-write=(),fullscreen=(),geolocation=(),gyroscope=(),hid=(),interest-cohort=(),magnetometer=(),microphone=(),payment=(),publickey-credentials-get=(),screen-wake-lock=(),serial=(),sync-xhr=(),usb=()
< Cache-Control: private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Expires: Thu, 01 Jan 1970 00:00:01 GMT
< Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< Report-To: [snip]
< NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< Strict-Transport-Security: max-age=0; includeSubDomains; preload
< X-Content-Type-Options: nosniff
< Server: cloudflare
< CF-RAY: [snip]
< alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400
<
Just a moment...
[snip]
```
`-H 'Host:' -H 'Host: bund.lkr.de'` first removes the header and then adds it again, which forces it to the end. The 307 is the expected response for this site; the 503 is the Cloudflare JS challenge.
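The same comparison can be sketched without curl by serializing the request by hand. The following is a minimal sketch using only the Python standard library; `build_request` and `fetch_status_line` are hypothetical names, not wpull functions. Cloudflare may also weigh other signals (such as the TLS handshake), so the status codes seen this way are not guaranteed to match the curl output above.

```python
import socket
import ssl


def build_request(host, path="/", extra_headers=(), host_first=True):
    """Serialize a minimal HTTP/1.1 GET request with Host first or last."""
    fields = list(extra_headers)
    if host_first:
        fields.insert(0, ("Host", host))
    else:
        fields.append(("Host", host))
    lines = ["GET {} HTTP/1.1".format(path)]
    lines += ["{}: {}".format(name, value) for name, value in fields]
    # The header section ends with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_status_line(host, request, port=443, timeout=10):
    """Send raw request bytes over TLS and return the response status line."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(request)
            data = b""
            # Read only until the first line of the response is complete.
            while b"\r\n" not in data:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

To compare the two orderings, call `fetch_status_line(host, build_request(host, extra_headers=..., host_first=True))` and again with `host_first=False`, passing the same `User-Agent` and `Accept` values as in the curl commands.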