Open blessingcharles opened 2 years ago
Thanks for this!
The Content_Length header is accepted as Content-Length; this can definitely lead to HTTP request smuggling.
Although I understand why this is suboptimal, as far as I can tell there's no spec which forbids this. There is a switch for this in nginx: https://nginx.org/en/docs/http/ngx_http_core_module.html#underscores_in_headers and also in HTTP::Headers: https://metacpan.org/release/OALDERS/HTTP-Message-6.36/source/lib/HTTP/Headers.pm#L10-12
But accepting multiple Content-Length headers with different values is pretty bad. This has previously been exploited for HTTP request smuggling by combining it with a proxy that treats Content_Length and Content-Length as different headers and forwards both, while http-daemon treats them as the same header; multiple Content-Length headers with different values were also accepted.
The RFC covers multiple Content-Length headers with different values: per the 4th point, the server must reject the request with a 400 error.
Right, I'm not saying we shouldn't address this -- I just wasn't aware of any RFC forbidding the use of underscores. Thanks for the link.
Will there be a CVE for this, since multiple Content-Length headers are accepted and an underscore is accepted in place of the hyphen in Content-Length?
Just to add some context for the underscore thing. This was well before my time, but I did some digging to find out how we got here:
$TRANSLATE_UNDERSCORE was at one point documented, but it was removed in 5.08: https://metacpan.org/dist/libwww-perl/source/Changes#L927
Previously it had read:
To make the life easier for perl users who wants to avoid quoting before the => operator, you can use '_' as a replacement for '-' in header names (this behaviour can be suppressed by setting the $HTTP::Headers::TRANSLATE_UNDERSCORE variable to a FALSE value).
That documentation was removed almost exactly 18 years ago. It looks like the context for this behaviour was around not wanting to quote keys in $h->header( $field => $value, ... ).
Setting $HTTP::Headers::TRANSLATE_UNDERSCORE = 0; would be a (clunky) way to disable this behaviour in the current version of the library.
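The translation described in the old documentation can be sketched as follows. This is a Python simulation of the documented behaviour, not the actual HTTP::Headers code, and the function name is made up for illustration:

```python
# Sketch of the field-name translation HTTP::Headers applies when
# $TRANSLATE_UNDERSCORE is true: underscores in the header name are
# rewritten to hyphens, so Content_Length and Content-Length collapse
# into the same header field.
def translate_field_name(name: str, translate_underscore: bool = True) -> str:
    """Hypothetical helper mirroring the documented behaviour."""
    return name.replace("_", "-") if translate_underscore else name

print(translate_field_name("Content_Length"))         # Content-Length
print(translate_field_name("Content_Length", False))  # Content_Length
```

This is convenient for outgoing requests built with unquoted fat-comma keys, but on incoming requests it merges header fields that the RFC treats as distinct.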
But accepting multiple Content-Length headers with different field values is really bad and prone to HRS.
I'm not arguing that the current behaviour is great. So far I'm just providing context for why the code is in its current state.
Still Investigating...
The underscores_in_headers switch in nginx is used to explicitly allow these headers, which would otherwise be ignored because of vulnerabilities.
The TRANSLATE_UNDERSCORE package variable has a different purpose: it makes a Foo_Bar header the same as Foo-Bar.
The latter behaviour is extremely dubious when dealing with incoming requests. According to the RFC, those are just different headers (unlike case-folding). One such use case is when working with good ole CGI-based applications, where you do not want to trip over CGI-defined environment variables.
The translation is super helpful when creating an HTTP::Message object and using the => 'fat comma' to stringify the header name... we should keep it there. However, on incoming requests, where we read data from the socket, we ought to have been more cautious.
However, it does not hurt most of the time, and I doubt it would lead to any more risk of smuggling... as long as we ensure that both values are the same.
As part of the remedy, I would suggest introducing a new HTTP::Daemon option: underscores_in_headers (or drop_underscores) that would intercept get_request and filter out any headers that could be from a malicious source.
Unless there are other known vulnerabilities related to other headers, I think translating the _ to - is 'harmless' and we should not drop those headers by default.
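A minimal sketch of what such a drop_underscores filter might do to an incoming header set. This is Python pseudocode for the idea only, not the proposed Perl implementation, and the function name is hypothetical:

```python
# Hypothetical filter for incoming headers: drop any field whose raw
# (untranslated) name contains an underscore, so Content_Length can no
# longer masquerade as Content-Length after translation.
def drop_underscore_headers(headers: dict) -> dict:
    return {name: value for name, value in headers.items() if "_" not in name}

incoming = {"Host": "localhost", "Content_Length": "3", "Content-Length": "22"}
print(drop_underscore_headers(incoming))
# {'Host': 'localhost', 'Content-Length': '22'}
```

The key point is that the filter must run on the raw field names, before any underscore-to-hyphen translation takes place.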
GIVEN that we do fix the other Content-Length related issues:
• chunked
GET / HTTP/1.1
host: localhost
Content_Length: 3
Content-Length: 22
abcGET /admin HTTP/1.1
The above-mentioned request is a potential candidate for HRS: a proxy treats Content_Length as an X- header and Content-Length as the RFC-standard CL header, and forwards the request downstream with a body of 22 bytes. But HTTP Daemon treats the first Content_Length as the RFC-standard header with a length of 3 bytes, thus leading to HRS.
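The disagreement can be illustrated by consuming the body of the request above with each of the two lengths. This is a Python sketch of the framing only, not of either parser:

```python
# After the header block, the bytes remaining on the wire are:
payload = b"abcGET /admin HTTP/1.1"

def split_body(data: bytes, content_length: int):
    # Consume content_length bytes as the body; whatever remains is
    # parsed as the start of the next request on the connection.
    return data[:content_length], data[content_length:]

# A proxy that honours Content-Length: 22 sees one complete request:
body, rest = split_body(payload, 22)
print(rest)   # b'' - nothing left over

# A server that effectively uses 3 leaves a smuggled request behind:
body, rest = split_body(payload, 3)
print(rest)   # b'GET /admin HTTP/1.1'
```

Whichever side uses the smaller length ends up treating the trailing bytes as a brand-new request on the same connection.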
HTTP Daemon treats the first Content_Length as the RFC-standard header with a length of 3 bytes, thus leading to HRS
This is currently true because of the way perl handles arithmetic with strings, and because (with the default TRANSLATE_UNDERSCORE = 1) multiple occurrences of the same HTTP header field have their values comma-joined.
my $len = $r->header('Content-Length');
results in the string value 3, 22, and when that is used in arithmetic at line 292:
my $missing = $len - length($buf);
the value 3 will be used.
As I mentioned:
However, it does not hurt most of the time, and I doubt it would lead to any more risk of smuggling... as long as we ensure that both values are the same.
and
GIVEN that we do fix the other Content-Length related issues:
• multiple but different values
In other words: we should fix how we compute $len and drop the connection when we have multiple but different values.
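That rule can be sketched as: split the comma-joined value, require every occurrence to be the same plain integer, and treat anything else as fatal. A Python sketch of the rule, not the actual patch:

```python
# Strict Content-Length check: with comma-joined duplicates, all values
# must be identical unsigned integers; anything else is rejected and
# the caller should drop the connection.
def parse_content_length(raw: str):
    values = {v.strip() for v in raw.split(",")}
    if len(values) == 1:
        (v,) = values
        if v.isdigit():          # 1*DIGIT only: no sign, no decimals
            return int(v)
    return None                  # reject: ambiguous or malformed

print(parse_content_length("22, 22"))  # 22
print(parse_content_length("3, 22"))   # None
```

Identical duplicates are tolerated (the RFC allows folding them into one value); differing duplicates are exactly the CL.CL smuggling case and must kill the request.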
Will there be a CVE for this, as this falls under CWE-444?
Previous CVEs:
1. https://cwe.mitre.org/data/definitions/444.html
2. https://www.cvedetails.com/vulnerability-list/cweid-444/vulnerabilities.html
Possible attack scenarios are explained in CWE-444.
Will there be a CVE for this, as this falls under CWE-444?
I will have a look into that in the morning. I'm unfamiliar with that process, but this would be your first, right?
Yeah, this is my first CVE. I found this while testing various proxies and servers for parsing discrepancies that may lead to security issues, for a course in my undergraduate degree. I created a protocol fuzzer tool named BuzzShock for finding these kinds of discrepancy bugs.
@blessingcharles:
Will there be a CVE for this, as this falls under CWE-444?
Using GitHub's builtin Security Advisories tool, a new CVE has been requested.
It looks like I forgot to include some testing of how things should behave: https://gist.github.com/genio/24c76f2ceab91f7509fb72415aafbd09
@vanHoesel While we should be wanting a positive integer, things like +3 should be allowed. Mojo tends to rely on Scalar::Util's looks_like_number - I think I'd prefer that over rolling our own (even though it's not complex).
@genio, the commits in the private repo for the Security Advisory will only accept what is in the RFC, and reject anything other than an unsigned non-negative number... as the RFC shows: Content-Length = 1*DIGIT. As much as I hate causing breakage by rejecting 'something positive' with a plus sign (+), I really see no valid reason that we should diverge from the RFC. Mojolicious accepting it is not a valid reason.
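The difference between the two positions can be shown side by side. Below is a rough Python stand-in for Scalar::Util's looks_like_number (an approximation, not its exact semantics) next to the strict 1*DIGIT grammar from the RFC:

```python
import re

def looks_like_number(s: str) -> bool:
    # Rough approximation of Scalar::Util::looks_like_number:
    # accepts signs, decimals, exponents.
    try:
        float(s)
        return True
    except ValueError:
        return False

def rfc_content_length(s: str) -> bool:
    # Content-Length = 1*DIGIT: digits only, no sign, no decimal point.
    return re.fullmatch(r"[0-9]+", s) is not None

for v in ["3", "+3", "-3", "3.5"]:
    print(v, looks_like_number(v), rfc_content_length(v))
```

Only "3" passes the strict grammar; "+3", "-3" and "3.5" look like numbers but are not valid Content-Length values.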
From the set of tests you have done for Mojolicious, it looks to me that only the third block is doing the right thing.
Tomorrow, I'll try to figure out how to add tests to the new changes.
There is a fix for it available, and I created a merge request. However, I do not know how this all works with Dzil and git-hooks and other nicely streamlined deployments.
@oalders can you take it from here ? please ?
@blessingcharles I am trying to understand why your POC 7 (FormFeed \x0c, BackSpace \x08 and Vertical Tab \x0b accepted before and after Content-Length values) would become an issue once we handle multiple Content-Length header-field values correctly (that is, all the same non-negative integers): HTTP::Daemon then already deals with Content-Length/Transfer-Encoding and does not have vulnerabilities regarding CWE-444 and request smuggling.
Perl solutions in general have a history of being lenient in what they accept as input and (mostly) strict in what they produce as output. Accepting these 'whitespace' characters, which are part of the \s regular-expression character class, is what makes this module do what it does and be nice to others.
I can easily make the handling of incoming requests more strict by changing the delimiters used when extracting HTTP header fields at line 166 from:
/^([^:\s]+)\s*:\s*(.*)/
to
/^([^:\s]+):[ \t]*(.*)/
This is more in line with the RFC, where no space is allowed between the header-field name and the colon (:) and only 'optional whitespace' is allowed (which is defined as: OWS = *( SP / HTAB )).
When doing so, I could then eventually check what has been passed in as the header-field value and deal with the other \s whitespace characters.
However, I do not feel comfortable doing so, because that would mean that HTTP::Headers (used to push values onto a key) could potentially receive 'dirty' characters that previously had been absent, potentially causing problems downstream.
@blessingcharles I am trying to understand why your POC 7 (FormFeed \x0c, BackSpace \x08 and Vertical Tab \x0b accepted before and after Content-Length values) would become an issue once we handle multiple Content-Length header-field values correctly (that is, all the same non-negative integers): HTTP::Daemon then already deals with Content-Length/Transfer-Encoding and does not have vulnerabilities regarding CWE-444 and request smuggling.
The Transfer-Encoding header also accepts \x09, \x0c and \x08 around the chunked value; these quirks might lead to HRS if the server and a downstream proxy disagree about them. You can refer to the research article https://portswigger.net/research/http-desync-attacks-request-smuggling-reborn, where they mention this discrepancy problem.
Another interesting thing I noted: between the header field-name and the colon, multiple whitespace characters are accepted, but per the RFC no whitespace should be allowed between them.
RFC grammar
header-field = field-name ":" OWS field-value OWS
field-name = token
token = 1*tchar
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
"^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding : \x0cchunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc 192.168.1.21 8008
echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding : \x09chunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc 192.168.1.19 8008
echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding : \x0bchunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc 192.168.1.18 8008
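The field-name rule from the grammar quoted above can be checked mechanically. A small Python sketch of the tchar set (the function name is made up for illustration):

```python
# tchar per the grammar quoted above:
# "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
# "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
TCHAR = set("!#$%&'*+-.^_`|~"
            "0123456789"
            "abcdefghijklmnopqrstuvwxyz"
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

def is_valid_field_name(name: str) -> bool:
    # field-name = token = 1*tchar: non-empty and tchar only, so neither
    # spaces nor control characters may appear before the colon.
    return len(name) > 0 and all(c in TCHAR for c in name)

print(is_valid_field_name("fru.*!*'&%.brf"))      # True - odd but legal
print(is_valid_field_name("Transfer-Encoding "))  # False - trailing space
```

So a bizarre-looking name full of punctuation is RFC-legal, while a single space between the name and the colon is not.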
HTTP::Daemon already deals with Content-Length/Transfer-Encoding and does not have vulnerabilities regarding CWE-444 and request smuggling.
Yeah, no TE.CL attacks are possible, but CL.CL attacks are currently possible, which falls under CWE-444. After fixing it [by rejecting multiple Content-Length headers with different values and treating Content_Length as an X- header], I think the major issue will be resolved, from my point of view.
Another interesting thing I noted: between the header field-name and the colon, multiple whitespace characters are accepted, but per the RFC no whitespace should be allowed between them.
I was reading that, and I would be very unpleasantly surprised to see HTTP header field names show up like HTTP-*-!Foo-Bar&Qux - let's just not do that!
Waiting for approval from @simbabque and guidance from @oalders to get this issue sorted.
Another interesting thing I noted: between the header field-name and the colon, multiple whitespace characters are accepted, but per the RFC no whitespace should be allowed between them.
I was reading that, and I would be very unpleasantly surprised to see HTTP header field names show up like HTTP-*-!Foo-Bar&Qux - let's just not do that!
Yeah, it is a surprise, but most webservers parse it like that, e.g. nodejs, golang and actix: they accept the above-mentioned header field-names, but they reject the request with a 400 status code if a space is encountered between the header-field name and the colon, as per the RFC.
On nodejs, golang and actix, the various tchar characters are accepted, but rejected if a space is present:
Random-header = fru.*\!*'&%.brf\t: nfrunib
echo -ne "GET / HTTP/1.1\r\nHost : localhost\r\nfru.*\!*'&%.brf\t: nfrunib\r\n\r\n" | nc localhost 8003
Waiting for approval from @simbabque and guidance from @oalders to get this issue sorted.
The work is going on in the background. 😄
Poor parsing of content-length header in httpdaemon will lead to http request smuggling
RFC security considerations
libwww-perl parses the Content-Length header poorly. If a proxy forwards a request to libwww-perl whose handling of the Content-Length header violates the RFC, this can lead to HTTP request smuggling, as libwww-perl parses the incoming request ambiguously and contradicts the downstream proxy.
Examples:
Reference for http request smuggling
POC
Below I have attached all the POCs for the parsing discrepancies in http-daemon that can lead to HTTP request smuggling.
Discrepancies in Parsing
1. Multiple Content-Length headers with different values are accepted, which is a violation according to the RFC and can lead to HTTP request smuggling.
POC
2. The Content-Length value is not parsed correctly.
E.g.: Content-Length: 3 3 , Content-Length: 3\r\n 1
POC
3. Negative Content-Length field values are mapped to positive values differently.
E.g.: -3 = 1 , -2 = 2 , -1 = 3
POC
4. A + sign is accepted before the Content-Length field value.
5. The Content_Length header is accepted as Content-Length; this will definitely lead to HTTP request smuggling.
6. Decimal values are accepted as Content-Length values.
7. FormFeed \x0c, BackSpace \x08 and Vertical Tab \x0b are accepted before and after Content-Length values:
Content-Length: [prefix]3[suffix]
version 6.14
Code to Reproduce