roots / acorn

Laravel components for WordPress plugins and themes
https://roots.io/acorn/
MIT License
824 stars 94 forks source link

PR #402 split headers follow-up discussion #404

Closed RafaelKr closed 1 month ago

RafaelKr commented 1 month ago

Summary

Last week I added a comment to #402. I added a comment to the now-merged PR where I'd like to discuss some more because I'm not sure if modifying current behaviour (as the PR currently does) is a good idea:

I built a little Benchmark: https://github.com/RafaelKr/php-header-split-bench Usage and the output/results I got are documented in the README.md.

As you can see by just using explode the header values are actually modified because one leading space is added. Usually (and according to RFC) this shouldn't be a problem, but it definitely modifies the current behavior:

{
    "X-With-Leading-Space": " Test",
    "X-With-Leading-Space-And-Colon": " Test: 123",
    "X-Without-Leading-Space": "Test-Header",
    "Content-Type": " application\/x-www-form-urlencoded",
    "Cache-Control": " private, must-revalidate"
}

The same behavior can also be observed in real usage with Laravel/Acorn.

I think using preg_split wouldn't really impact performance but provide consistent output independent of leading whitespace.

cc @QWp6t

Additional context

No response

QWp6t commented 1 month ago

Hey thanks for the feedback!

These are all valid HTTP headers

x-test:test
x-test: test
x-test:  test
x-test:                        test

My original thought was that if a developer adds a header with leading whitespace, we should keep it intact. Removing leading whitespace would "definitely modify current behavior", as you put it.

Just noting that what you're proposing would override what might potentially be a developer's preference for deliberately including leading whitespaces in their headers. I don't see what value that adds on our end or for our users other than to serve an arbitrary aesthetic preference.

This isn't a strong opinion, though. I can definitely be convinced otherwise. Do you know if WordPress, Laravel, or Symfony already strip leading whitespace in header values?

RafaelKr commented 1 month ago

Ah, I think we got a misunderstanding. The new problem now is that we're now always adding one space.

https://github.com/roots/acorn/blob/ef4a8a2f2719a063dc01c380a78633f3ea46b5db/src/Roots/Acorn/Application/Concerns/Bootable.php#L145-L153

Let's break this down with an example if someone did header('Content-Type: image/png') (notice this is what most people will do, there's exactly one space):

foreach (headers_list() as $header) {
   // $header = 'Content-Type: image/png'
   [$header, $value] = explode(':', $header, 2); 
   // $header = 'Content-Type'
   // $value = ' image/png'

   if (! headers_sent()) { 
       header_remove($header); 
   } 

   $response->header($header, $value, $header !== 'Set-Cookie'); 
}

If we now call $response->headers->get('Content-Type') we will get image/png (with leading space). I would expect to get image/png.

Later the header will be sent out via header($name.': '.$value, $replace, $this->statusCode); at this place: https://github.com/symfony/http-foundation/blob/f68770376c1b1d32914af58a37dc1fdb0e5795a3/Response.php#L346

This results in Content-Type: image/png (notice there are two spaces now, while the user initially set Content-Type: image/png with one space).

Also if someone would use a Laravel Middleware which has a check similar to this str_starts_with($response->headers->get('Content-Type'), 'image/') it would suddenly fail.

I think what we actually would like to do is using [$header, $value] = preg_split("/:\s{0,1}/", $header, 2); instead of explode. No additional trim is needed. This only adds one space if there previously was no space at all. But it keeps every additional space the user actually added. Also according to my updated benchmark repo using /:\s{0,1}/ is even a little more performant than using /:\s*/, at least in the test where a header like X-With-Leading-Space-5: Test is used.

RafaelKr commented 1 month ago

I just created #405.

Little side-note: I'm using the Caddy Webserver for local development. There I couldn't perceive the added space because it was trimmed there. But in my IntelliJ Debug output tab I could see that the headers really contained the one additional space (Content-Type: application/json). Probably Caddy does this due to RFC 7320 section 3.2.4 Field Parsing

[...]

A field value might be preceded and/or followed by optional whitespace (OWS); a single SP preceding the field-value is preferred for consistent readability by humans. The field value does not include any leading or trailing whitespace: OWS occurring before the first non-whitespace octet of the field value or after the last non-whitespace octet of the field value ought to be excluded by parsers when extracting the field value from a header field.

[...]

QWp6t commented 1 month ago

Okay, that makes sense to me. Didn't occur to me that Symfony was already adding a leading space to all headers. So if we're including a leading space in the value, and then Symfony adds their own, then the result is two leading spaces. Thanks for clarifying!

lagiosdi commented 2 weeks ago

I’m encountering an issue due to sending multiple Link headers from different PHP files.

Example:

Link: http://www.mywebsite.com/blue-shirt.html; rel="canonical" Link: ; rel="alternate" For SEO purposes, I need to support multiple Link headers with the same key.

The above change send only one Link Header

oxyc commented 2 weeks ago

Related: https://github.com/roots/acorn/pull/356#issuecomment-1998418883 and https://github.com/roots/acorn/pull/356/commits/c6fd98e1222e2e0f8b38a0571fbc50c7c8dd41a5