php / php-src

The PHP Interpreter
https://www.php.net

PHP processing a truncated POST request #12343

Open ScottHelme opened 1 year ago

ScottHelme commented 1 year ago

Description

I've observed scenarios where PHP reads only a partial POST payload and begins processing it. The main scenario seems to be when Nginx times out on the fastcgi_pass to PHP: PHP reads whatever partial payload was written to the socket buffer and processes it.


The following code can reproduce the issue:

<?php
// Hold the worker busy so a backlog of requests builds up behind it.
sleep(30);
// Read the (possibly truncated) request body and try to decode it.
$json = json_decode(file_get_contents('php://input'));
if ($json == null) {
  // Log the declared Content-Length alongside the number of bytes actually received.
  syslog(LOG_WARNING, "JSON decode failed: " . $_SERVER['CONTENT_LENGTH'] . " " . strlen(file_get_contents('php://input')));
}


Send enough requests to exceed pm.max_children and cause a backlog. You can lower fastcgi_connect_timeout and fastcgi_send_timeout below their defaults to exacerbate the issue for testing.

I believe the issue is that the read() system call returns 0 earlier than expected, but PHP does not check the size of the POST payload that was actually read against the declared Content-Length.

https://github.com/php/php-src/blob/1c93cdcea430b77a28b4a552a350cd84484f7259/main/fastcgi.c#L970
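
For illustration, here is a minimal standalone sketch of the kind of comparison that appears to be missing. This is plain POSIX read() with invented names, not the actual fastcgi.c code: it counts how many body bytes actually arrived and flags the case where the stream ends before the declared Content-Length is reached.

#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Read up to content_length bytes of request body from fd into buf.
 * Returns the number of bytes actually read, or -1 on I/O error. */
static ssize_t read_post_body(int fd, char *buf, size_t buf_len, size_t content_length)
{
    size_t total = 0;
    size_t want = content_length < buf_len ? content_length : buf_len;

    while (total < want) {
        ssize_t n = read(fd, buf + total, want - total);
        if (n < 0) {
            return -1;          /* I/O error */
        }
        if (n == 0) {
            break;              /* peer closed the stream early: the
                                   "read() returns 0 before expected" case */
        }
        total += (size_t)n;
    }

    if (total < content_length) {
        /* The check argued for above: the body is shorter than the declared
           Content-Length, so treat it as truncated rather than silently
           handing partial data to the script. */
        fprintf(stderr, "short POST body: got %zu of %zu bytes\n",
                total, content_length);
    }

    return (ssize_t)total;
}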


The only validation of the size of the POST payload is in SAPI_POST_READER_FUNC(). This function first checks that the declared Content-Length does not exceed post_max_size, then reads the POST payload, and finally checks that the actual size of the payload that was read does not exceed post_max_size, here:

https://github.com/php/php-src/blob/1c93cdcea430b77a28b4a552a350cd84484f7259/main/SAPI.c#L282-L285
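
To make the control flow easier to follow without opening the link, here is a paraphrased and heavily simplified sketch of that validation order (plain C with invented names, not the real SAPI.c): both checks compare against post_max_size only, so a body that arrives shorter than the declared Content-Length passes through silently.

#include <stdio.h>

/* Simplified sketch of the validation order in SAPI_POST_READER_FUNC().
 * read_body() stands in for the actual stream-copy loop and returns the
 * number of body bytes that were really received. */
static void post_reader_sketch(long content_length, long post_max_size,
                               long (*read_body)(void))
{
    /* 1. Up-front check: declared Content-Length vs post_max_size. */
    if (content_length > post_max_size) {
        fprintf(stderr, "declared Content-Length exceeds post_max_size\n");
        return;
    }

    /* 2. Read the body; this may stop early if the stream ends. */
    long read_post_bytes = read_body();

    /* 3. After-the-fact check: still only "too big", never "too short". */
    if (read_post_bytes > post_max_size) {
        fprintf(stderr, "actual POST length exceeds post_max_size\n");
        return;
    }

    /* No branch compares read_post_bytes with content_length, which is the
       gap described above. */
}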


The error message is also misleading and could be updated. With "does not match", it implies that the check can detect a POST length either longer or shorter than the declared Content-Length, but it will in fact only detect the case where the POST length is longer than declared. An error message like the following would be more accurate:

php_error_docref(NULL, E_WARNING, "Actual POST length exceeds Content-Length, and exceeds " ZEND_LONG_FMT " bytes", SG(post_max_size));


I created a patch for SAPI.c to demonstrate the handling of a truncated POST payload that I might expect to see. It checks whether the size of the POST payload that was read is smaller than the size declared in Content-Length and, if so, discards the data.

--- SAPI.c.orig 2023-09-29 12:32:46.020740222 +0000
+++ SAPI.c      2023-09-29 20:28:36.530234632 +0000
@@ -288,6 +288,12 @@
                                break;
                        }
                }
+
+               if (SG(read_post_bytes) < SG(request_info).content_length) {
+                       php_stream_truncate_set_size(SG(request_info).request_body, 0);
+                       php_error_docref(NULL, E_WARNING, "POST length of " ZEND_LONG_FMT " bytes was less than declared Content-Length " ZEND_LONG_FMT " bytes; all data discarded", SG(read_post_bytes), SG(request_info).content_length);
+               }
+
                php_stream_rewind(SG(request_info).request_body);
        }
 }

PHP Version

8.2.10

Operating System

Ubuntu 22.04 (LTS) x64

ScottHelme commented 1 year ago

If anyone would like information on the story behind finding and investigating this behaviour, I've covered it in a two-part blog series so far:

Part 1: https://scotthelme.co.uk/unravelling-mystery-of-truncated-post-requests-report-uri/
Part 2: https://scotthelme.co.uk/processing-truncated-requests-php-debugging-deep-dive/

Edit: Part 3 is now published if anyone would like to read it: https://scotthelme.co.uk/sockets-under-the-hood/

bukka commented 1 year ago

I'm aware of this issue and have got this note on my TODO list for some time:

I actually haven't created an issue for this yet, so this is useful. It is quite high on my list in terms of priority, so I will hopefully get to it in the next few months.

bukka commented 1 year ago

So I have been looking into this, and from the code this logic seems to be intentional.

It might be trying to follow this part of the FastCGI spec:

Next the Filter application receives CGI/1.1 stdin data from the Web server over FCGI_STDIN. The application receives at most CONTENT_LENGTH bytes from this stream before receiving the end-of-stream indication. (The application receives less than CONTENT_LENGTH bytes only if the HTTP client fails to provide them, e.g. because the client crashed.)

It kind of suggests that receiving less is possible/acceptable. There is, however, a recommendation (a "should") that states:

A Filter should compare the number of bytes received on FCGI_STDIN with CONTENT_LENGTH and on FCGI_DATA with FCGI_DATA_LENGTH. If the numbers don’t match and the Filter is a query, the Filter response should provide an indication that data is missing. If the numbers don’t match and the Filter is an update, the Filter should abort the update.

It's not a hard requirement (it's a "should"), but I think we should follow it by default, as a mismatch suggests that something is not right with the client. The problem is that there might be clients that misbehave and for which things currently still work for users. So this will require proper testing of all clients to be sure we don't break anything. At least we should test the most used ones and introduce the change only in master, so it's treated more like a feature. I still think it's kind of a bug, but I will mark it as a feature since it's more a change of behaviour.

I actually realised that we already discussed it in https://github.com/php/php-src/pull/7509. That PR is a good example of a client (nginx) that does not send any content length. Thinking about that one, it might make sense to allow a missing content length together with this change, so we don't introduce connection aborts for chunked encoding. I think it also makes sense to add a configuration option to restore the current behaviour in case there are clients that would break otherwise, just as some sort of backup.

UCIS commented 5 months ago

I think we ran into the same problem using Apache + PHP-FPM. I was able to reproduce the error using some shell code:

(echo -en 'POST /test.php HTTP/1.0\r\nHost: localhost\r\nContent-Length: 40000\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\ntest=test&abcd='; dd if=/dev/urandom bs=1 count=8k | hexdump -v -e '/1 "%02X"'; sleep 1000) | nc localhost 80

This results in truncated data in the $_POST variable after a timeout. Even if the command is terminated, the PHP script still runs with the incomplete input (it might need to be logged to a file to show this behavior).

I think this happens because PHP might incorrectly handle an end-of-file condition on the FCGI input. The FCGI specification I found states that all streams (including STDIN) are normally terminated by an empty record, containing no data. I am not sure of the intended behavior in case this empty record is missing, but the stream should probably be considered as incomplete. I have confirmed with tcpdump that Apache does not send the empty record in case of a time-out, but instead just closes the connection.
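
As an illustration of that last point, here is a minimal standalone sketch (my own simplified code with invented names, not php-src's FastCGI implementation) of how a FastCGI application can distinguish a properly terminated FCGI_STDIN stream from a connection that was simply closed: the body is only complete once a zero-length FCGI_STDIN record has been seen.

#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

#define FCGI_STDIN 5

/* FCGI_Header layout from the FastCGI specification (8 bytes). */
typedef struct {
    uint8_t version;
    uint8_t type;
    uint8_t request_id_b1, request_id_b0;
    uint8_t content_length_b1, content_length_b0;
    uint8_t padding_length;
    uint8_t reserved;
} fcgi_header;

/* Read exactly len bytes; returns -1 if the connection closes first. */
static int read_full(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) {
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Returns 1 if the empty FCGI_STDIN terminator record was received,
 * 0 if the web server closed the connection without sending it
 * (the Apache time-out case described above). */
static int stdin_stream_terminated(int fd)
{
    fcgi_header h;
    uint8_t discard[65535 + 255];   /* max record content + max padding */

    for (;;) {
        if (read_full(fd, &h, sizeof h) != 0) {
            return 0;               /* connection closed mid-stream */
        }
        uint16_t clen = (uint16_t)((h.content_length_b1 << 8) | h.content_length_b0);
        size_t body = (size_t)clen + h.padding_length;
        if (body > 0 && read_full(fd, discard, body) != 0) {
            return 0;
        }
        if (h.type == FCGI_STDIN && clen == 0) {
            return 1;               /* empty record: stream properly terminated */
        }
        /* Non-empty FCGI_STDIN records carry POST body data; other record
           types are ignored in this sketch. */
    }
}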

While there is an explicit ABORT_REQUEST message in the FCGI specification, the Apache developers chose not to use this because it would be mostly useful in situations where multiple requests are multiplexed over a single FCGI connection, which seems to be rarely used.

PHP should probably check for the empty data record on STDIN, and if it has not been received before the end-of-stream on the FCGI connection, either abort the request entirely, or at least consider it as an aborted request and set the connection_status accordingly so the application can detect the aborted connection. #10106 might be related.