Nyholm / psr7-server

Helper classes to use any PSR7 implementation as your main request and response
MIT License
90 stars 21 forks source link

Header with only numbers seems to cause exception #35

Closed nickdnk closed 4 years ago

nickdnk commented 4 years ago

Hello

Our server received some scripted attack attempts aimed at Wordpress (which we don't run though). These requests have headers that look like this:

0: Accept-Language: en-US,en;q=0.5
1: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36

Which seems to cause the "Header name must be an RFC 7230 compatible string." exception to be thrown. It appears to be because of the header regex not accepting this, in https://github.com/Nyholm/psr7/blob/master/src/MessageTrait.php.

Is this a bug? This approach also causes a 500 to be thrown for a client error, which you'd expect to be a 400. I'm using Slim, so maybe this needs to be fixed there. I just wanted to check in here and see if you had any feedback before I bring it up with Slim.

Edit: It's thrown for any header that consists only of numbers.

Nyholm commented 4 years ago

Hey.

Excellent question. You see this exception because your framework added a header name that which was not a string or an empty string.

According to RFC 7230 we cannot accept a header name that is not a string.


The issue seams to be with the header parsing in Slim. If you can provide a stack trace it would be easier to debug.

nickdnk commented 4 years ago

I figured as much. Here's a stack:

InvalidArgumentException Header name must be an RFC 7230 compatible string. 
    /var/app/current/vendor/nyholm/psr7/src/MessageTrait.php:94 Nyholm\Psr7\ServerRequest::withAddedHeader
    /var/app/current/vendor/nyholm/psr7-server/src/ServerRequestCreator.php:68 Nyholm\Psr7Server\ServerRequestCreator::fromArrays
    /var/app/current/vendor/nyholm/psr7-server/src/ServerRequestCreator.php:54 Nyholm\Psr7Server\ServerRequestCreator::fromGlobals
    /var/app/current/vendor/slim/slim/Slim/Factory/Psr17/ServerRequestCreator.php:46 Slim\Factory\Psr17\ServerRequestCreator::createServerRequestFromGlobals
    /var/app/current/vendor/slim/slim/Slim/Factory/Psr17/SlimHttpServerRequestCreator.php:48 Slim\Factory\Psr17\SlimHttpServerRequestCreator::createServerRequestFromGlobals
    /var/app/current/vendor/slim/slim/Slim/App.php:192 Slim\App::run
Nyholm commented 4 years ago

Do you know if it is getallheaders() or static::getHeadersFromServer() that provides $headers with data?

https://github.com/Nyholm/psr7-server/blob/master/src/ServerRequestCreator.php#L52

Nyholm commented 4 years ago

I think we need to add some extra checks here: https://github.com/Nyholm/psr7-server/blob/master/src/ServerRequestCreator.php#L68

Check if header name is string (or maybe force it to be). @Zegnat do you think we should drop invalid headers? Or should an exception be thrown?..

I'll move this issue to nyholm/psr7-server

nickdnk commented 4 years ago

In my case getallheaders() works, so I'm assuming that that function is called.

Perhaps @l0gicgate can help with this.

l0gicgate commented 4 years ago

The ServerRequest object is being created by Nyholm/psr7-server here as we can see in the stack trace. If I had to guess I’d say that it may have to go with getallheaders(). Is that being polyfilled? Are you using Apache or Nginx to serve content @nickdnk

nickdnk commented 4 years ago

@l0gicgate Nginx alpine image in a docker environment when testing, but my Apache production environment does exactly the same thing. I'm not sure if that helps you. It should be pretty trivial to test though, just load up a sample Slim project and send the above headers.

l0gicgate commented 4 years ago

@nickdnk you should verify to see if the getallheaders() method is available in both environments. Otherwise it’s falling back on the manual static method. An FYI you can add a polyfill for getallheaders()

https://github.com/ralouphie/getallheaders

Zegnat commented 4 years ago

Oh interesting. So it seems like either getallheaders() or our polyfill for it (which is very close to the one by ralouphie, so loading that one would probably not help) is getting the faulty headers. Which in turn means that the web server (Nginx) is accepting non-HTTP compliant headers and passing them on to the PHP application?

Personally I would have loved to not have to care about this, and leave it up to the web server to respond with a 400 error when it gets send non-HTTP headers. But seeing as this issue now exists, clearly web servers cannot be trusted to do request validation. (Or maybe this is valid in a more recent HTTP RFC? I have not gone to check.)

This approach also causes a 500 to be thrown for a client error, which you'd expect to be a 400.

I would also expect a 400, but that is up to your application logic. Our library can’t do much better than throw an exception. Any unhandled exceptions are always turned into 500 errors because the application logic stopped on them.

@Zegnat do you think we should drop invalid headers? Or should an exception be thrown?..

I am split on this.

I think ServerRequestCreator::fromArrays should always throw the exception, because the caller of that method would have provided faulty header names in the $headers array. Here it even makes sense for it to be an InvalidArgumentException.

But when calling ServerRequestCreator::fromGlobals I can go both ways. On the one hand users may expect psr7-server to handle common errors and Just Work™️, e.g. by silently dropping them. Alternatively throwing a clear exception means the user can transform it into a 400 response back to the client.

nickdnk commented 4 years ago

This approach also causes a 500 to be thrown for a client error, which you'd expect to be a 400.

I would also expect a 400, but that is up to your application logic. Our library can’t do much better than throw an exception. Any unhandled exceptions are always turned into 500 errors because the application logic stopped on them.

This problem with this, in Slim's case, is that the error handler is not instantiated at the time this exception is thrown, so you'd have to wrap your entire Slim application in a try/catch only to handle this. That, or Slim needs to catch this exception whenever it uses the PSR7-component, so it can be passed to the built-in error handler as with any other code that throws an \InvalidArgumentException from within the Slim app. As a Slim user I have no way of handling this error as it's not converted to a HttpBadRequestException or something similar (https://github.com/slimphp/Slim/blob/4.x/Slim/Exception/HttpBadRequestException.php). It just crashes the framework.

I think given the fact that this happens on both Nginx Docker and my live Apache environment, it should probably be handled whether or not I have polyfills. If it crashes my framework on two different environments there's a good chance it will crash for a lot of users, so it needs to be fixed some other way than inspecting the getallheaders() function, would you not agree? Or am I misunderstanding your reasoning behind exploring this avenue for debugging?

l0gicgate commented 4 years ago

@nickdnk see https://github.com/slimphp/Slim-Skeleton/blob/master/public/index.php#L61 this is how we handle out of app exceptions while still using the Slim handler.

nickdnk commented 4 years ago

@nickdnk see https://github.com/slimphp/Slim-Skeleton/blob/master/public/index.php#L61 this is how we handle out of app exceptions while still using the Slim handler.

Aha. That helps. Thanks.

Zegnat commented 4 years ago

That, or Slim needs to catch this exception whenever it uses the PSR7-component, so it can be passed to the built-in error handler as with any other code that throws an \InvalidArgumentException from within the Slim app. As a Slim user I have no way of handling this error as it's not converted to a HttpBadRequestException or something similar (https://github.com/slimphp/Slim/blob/4.x/Slim/Exception/HttpBadRequestException.php). It just crashes the framework.

psr7-server is not built for Slim but as a generic “simplest” approach to taking a couple of arrays and making it into a PSR-7 object. From where I am standing (and this may sound harsher than it is meant) psr7-server should not care at all about HttpBadRequestException or anything else framework specific.

It is already documented that calling ServerRequestCreator::fromGlobals may throw an InvalidArgumentException. Although we do not mention the specific case of the headers failing as a reason. So it sounds like the implementation in Slim is broken, as it is already ignoring a documented case where an exception may be thrown.

That said …

If it crashes my framework on two different environments there's a good chance it will crash for a lot of users, so it needs to be fixed some other way than inspecting the getallheaders() function, would you not agree? Or am I misunderstanding your reasoning behind exploring this avenue for debugging?

I agree with you that if users are not expecting ServerRequestCreator::fromGlobals to throw exceptions, or are expecting some level of smart error handling, we should be talking about adding such a thing!

I am just not exactly sure where / how. There are many cases where we are calling the methods on the PSR-7 ServerRequest object. Do we need to think of functioning fallbacks for all cases? What is the silent fallback for ServerRequest::withParsedBody failing?

Thinking about that also means we may be letting the psr7-server code base grow into what may be better addressed by frameworks than by a single factory. But maybe it wouldn’t be all too much code in the grand scheme of things. Just something that needs to be thought about. Separation of concerns is a big thing when working with a modular framework, and that is the power of the PSRs.

Zegnat commented 4 years ago

Actually, having started to dig a little more into the code, something isn’t making sense to me. We might be hitting some PHP limitation that I have never before considered.

It is actually completely valid by RFC 7230 to have a header name just be 0. At least from my quick re-reading of it. However, when we create an associated array with header names as keys and header values as values it is PHP that does not support the string "0" as an array item key!

php > $arr = ['0' => 'string'];
php > foreach ($arr as $name => $value){
php { var_dump($name, $value);
php { }
int(0)
string(6) "string"
php >

So when we make sure that only strings can be used as header values, we fail our internal check. This took an interesting turn 🤔

Do note that everything I stated before still applies. ServerRequestCreator::fromGlobals can still throw exceptions, and that can still give issues in Slim. But in this case, it is actually a bug in psr7-server as we should be accepting these headers!

nickdnk commented 4 years ago

That, or Slim needs to catch this exception whenever it uses the PSR7-component, so it can be passed to the built-in error handler as with any other code that throws an \InvalidArgumentException from within the Slim app. As a Slim user I have no way of handling this error as it's not converted to a HttpBadRequestException or something similar (https://github.com/slimphp/Slim/blob/4.x/Slim/Exception/HttpBadRequestException.php). It just crashes the framework.

psr7-server is not built for Slim but as a generic “simplest” approach to taking a couple of arrays and making it into a PSR-7 object. From where I am standing (and this may sound harsher than it is meant) psr7-server should not care at all about HttpBadRequestException or anything else framework specific.

I know. I'm not saying you have to fix this. This part is clearly Slim's job. I've made very few contributions to Slim, so I'm just elaborating on the problem so others (such as @l0gicgate ) have an easier time finding out how to best correct this.

I agree with you that if users are not expecting ServerRequestCreator::fromGlobals to throw exceptions, or are expecting some level of smart error handling, we should be talking about adding such a thing!

I am just not exactly sure where / how. There are many cases where we are calling the methods on the PSR-7 ServerRequest object. Do we need to think of functioning fallbacks for all cases? What is the silent fallback for ServerRequest::withParsedBody failing?

I don't know enough about PSR-7 or this library to comment on this. I just wanted to let you guys know that this is an issue.

php > $arr = ['0' => 'string']; php > foreach ($arr as $name => $value){ php { var_dump($name, $value); php { } int(0) string(6) "string" php>

That's... a problem :)

Zegnat commented 4 years ago

So we may have an interesting edge-case on our hands.

That’s… a problem :)

It gets worse. Multiple zeroes are obviously a string, but '10' is not:

php > $arr = ['0' => 'a', '00' => 'b', '10' => 'c'];
php > foreach ($arr as $name => $value) {
php { echo $name . ': ' . (is_string($name) ? 'yes' : 'no') . "\n";
php { }
0: no
00: yes
10: no

It looks like we may have to treat integers as if we were provided with strings when they come up as part of the $headers array when passed to ServerRequestCreator::fromArrays. Which isn’t too bad, we can simply add a strval() call before passing on the header names to ServerRequest::withAddedHeader.

The worse part is that Nyholm/psr7 then internally changes it into an array key again before validating! So it will fail one step down the chain.

I am almost tempted to change the internal header representation within Nyholm/psr7 to tuples, but that is probably horribly inefficient. But there is no dictionary implementation in PHP apart from associated arrays…

Going to need a little more of a think here.


Edit to add: I hope my thinking aloud here is helpful for somebody other than me. But future me is going to be thankful when messing around trying to write unit tests for all this.

nickdnk commented 4 years ago

It gets worse. Multiple zeroes are obviously a string, but '10' is not:

Obviously. Haha.

Zegnat commented 4 years ago

I should have maybe added an irony mark around that statement 😉 But it is late and I only have so much patience with PHP right now, haha!

Logging off…

nickdnk commented 4 years ago

Hello

So I did a little follow-up research.

As you've already established, @Zegnat, we cannot assign to arrays using numbers or numeric strings as keys in PHP without them being cast to integers. We could do it with a stdClass, but that's probably too much of a change: https://stackoverflow.com/questions/4100488/a-numeric-string-as-array-key-in-php

Given that this can never work with a PHP array, and that most (I assume) frameworks implement the header logic as a $key => $value array, I suggest we loop these values and silently delete keys that return true when passed to ctype_digit(). In https://github.com/Nyholm/psr7-server/blob/master/src/ServerRequestCreator.php line 113 we could simply make this change:

if ($value && 0 === \strpos($key, 'HTTP_')) {
    $name = \strtr(\strtolower(\substr($key, 5)), '_', '-');
    if (!ctype_digit($name)) {
        $headers[$name] = $value;
    }
    continue;
}

And if getallheaders() was used on line 52:

$headers = [];
foreach (getallheaders() as $key => $value) {
    if (!ctype_digit($key)) {
        $headers[$key] = $value;
    }
}

I have confirmed that these changes fix the problem. But of course, the header won't be available. I doubt this will be an issue, as I've never come across a header the consisted only of numbers. And any framework that uses this package and depends on numeric headers is already broken.

Edit: Do correct me if I'm wrong. I'm not very experienced with this particular component.

Zegnat commented 4 years ago

I have confirmed that these changes fix the problem. But of course, the header won't be available. I doubt this will be an issue, as I've never come across a header the consisted only of numbers.

My problem with that solution is that it does not address the underlying issue. Anyone using Nyholm/psr7 outside of psr7-server is still blocked from doing $request->withHeader('0', 'Valid HTTP').

I would rather fix it at the bottom than only here within the abstraction of a factory.

I have been writing some tests today but got sidetracked into a different project. I would however always encourage writing tests first so you know what it is you want to achieve. I got as far as the following:

    public function testSupportNumericHeaderNames()
    {
        // Test function in constructor.
        $r = new Request('GET', '', [
            '200' => 'Content-Length',
        ]);
        // Test getHeaders, getHeader, and getHeaderLine.
        $this->assertSame(['200' => ['Content-Length']], $r->getHeaders());
        $this->assertSame(['Content-Length'], $r->getHeader('200'));
        $this->assertSame('Content-Length', $r->getHeaderLine('200'));
        // Test adding headers after construction.
        $r = $r->withHeader('300', 'C')->withAddedHeader('200', ['A', 'B']);
        $this->assertSame(['200' => ['Content-Length', 'A', 'B'], '300' => ['C']], $r->getHeaders());
        // Test removing header.
        $r = $r->withoutHeader('300');
        $this->assertSame(['200' => ['Content-Length', 'A', 'B']], $r->getHeaders());
    }

(I am using 200 and Content-Length because there is already a test RequestTest::testSupportNumericHeaders that uses those as value and header respectively.)

This test would be passable with surprisingly little code:

diff --git a/src/MessageTrait.php b/src/MessageTrait.php
index 0f7635d..57f7fe7 100644
--- a/src/MessageTrait.php
+++ b/src/MessageTrait.php
@@ -140,6 +140,7 @@ trait MessageTrait
     private function setHeaders(array $headers): void
     {
         foreach ($headers as $header => $value) {
+            if (is_int($header)) $header = strval($header);
             $value = $this->validateAndTrimHeader($header, $value);
             $normalized = self::lowercase($header);
             if (isset($this->headerNames[$normalized])) {

With that we would not have to skip the headers at all and can include them.

Not really poured over that test yet, so if I missed a case I would love to hear it!

nickdnk commented 4 years ago

I have confirmed that these changes fix the problem. But of course, the header won't be available. I doubt this will be an issue, as I've never come across a header the consisted only of numbers.

My problem with that solution is that it does not address the underlying issue. Anyone using Nyholm/psr7 outside of psr7-server is still blocked from doing $request->withHeader('0', 'Valid HTTP').

I would rather fix it at the bottom than only here within the abstraction of a factory.

I have been writing some tests today but got sidetracked into a different project. I would however always encourage writing tests first so you know what it is you want to achieve. I got as far as the following:

    public function testSupportNumericHeaderNames()
    {
        // Test function in constructor.
        $r = new Request('GET', '', [
            '200' => 'Content-Length',
        ]);
        // Test getHeaders, getHeader, and getHeaderLine.
        $this->assertSame(['200' => ['Content-Length']], $r->getHeaders());
        $this->assertSame(['Content-Length'], $r->getHeader('200'));
        $this->assertSame('Content-Length', $r->getHeaderLine('200'));
        // Test adding headers after construction.
        $r = $r->withHeader('300', 'C')->withAddedHeader('200', ['A', 'B']);
        $this->assertSame(['200' => ['Content-Length', 'A', 'B'], '300' => ['C']], $r->getHeaders());
        // Test removing header.
        $r = $r->withoutHeader('300');
        $this->assertSame(['200' => ['Content-Length', 'A', 'B']], $r->getHeaders());
    }

(I am using 200 and Content-Length because there is already a test RequestTest::testSupportNumericHeaders that uses those as value and header respectively.)

This test would be passable with surprisingly little code:

diff --git a/src/MessageTrait.php b/src/MessageTrait.php
index 0f7635d..57f7fe7 100644
--- a/src/MessageTrait.php
+++ b/src/MessageTrait.php
@@ -140,6 +140,7 @@ trait MessageTrait
     private function setHeaders(array $headers): void
     {
         foreach ($headers as $header => $value) {
+            if (is_int($header)) $header = strval($header);
             $value = $this->validateAndTrimHeader($header, $value);
             $normalized = self::lowercase($header);
             if (isset($this->headerNames[$normalized])) {

With that we would not have to skip the headers at all and can include them.

Not really poured over that test yet, so if I missed a case I would love to hear it!

I had not even realised that I had "fixed it" in an entirely different package than the one that caused the problem. That's obviously not acceptable.

I've tested the string-cast solution in my environment, but that alone is not enough. I still get an InvalidArgumentException with Header name must be an RFC 7230 compatible string., this time coming from the validation in withAddedHeader though, which seems to redundantly validate what is already being validated when this method then proceeds to call setHeaders on its clone. The exception and the error description is identical.

If we change line 93 in MessageTrait to:

if (!\is_int($header) && (!\is_string($header) || '' === $header)) {
    throw new \InvalidArgumentException('Header name must be a non-emtpy string or an integer.');
}

Then it all seems to work out fine. At least it does in my environment. And we avoid using the same error in two different places. I'm slightly confused as to why there is validation when calling withAddedHeader() but not withHeader(). The input seems to have the same constraints? Maybe we could remove this check entirely in withAddedHeader()?

I've opened a PR that includes the string-cast, the is_int check I added (I did not change the error message though) and the test you wrote for numeric headers. The unit tests pass locally but I'm not sure what the other problems are: https://github.com/Nyholm/psr7/pull/149

Nyholm commented 4 years ago

Nyholm/psr7:1.3 is released now.

Zegnat commented 4 years ago

Fixed as of psr7-server version 0.4.2.