nodejs / http-parser

http request/response parser for c

count the URL and headers separately when checking for HPE_HEADER_OVERFLOW errors #463

Open caiolrm opened 5 years ago

caiolrm commented 5 years ago

Counting headers, the URL, and the start line separately: previously everything before the headers_done state was counted as headers, so HPE_HEADER_OVERFLOW was thrown even when what actually caused the overflow was an overly long request URL.

Creating an HPE_URL_OVERFLOW error so that embedders still have a request-line limit, while now differentiating which part exceeded that limit.
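
Roughly, the approach is to keep one byte counter per element and compare each counter against its own limit. A minimal sketch of that idea, not the patch itself (the counter names, limits, and enum below are illustrative):

/* Sketch: account for the request-target and the header block separately,
 * so an overflow can be attributed to the right element. */
#include <stddef.h>

#define MAX_URL_SIZE    (8 * 1024)   /* illustrative request-target limit */
#define MAX_HEADER_SIZE (80 * 1024)  /* illustrative header-block limit   */

enum overflow_kind { OVERFLOW_NONE, OVERFLOW_URL, OVERFLOW_HEADER };

struct byte_counts {
  size_t nread_url;     /* bytes consumed while parsing the request-target */
  size_t nread_header;  /* bytes consumed while parsing header fields      */
};

static enum overflow_kind check_overflow(const struct byte_counts *c) {
  if (c->nread_url > MAX_URL_SIZE)
    return OVERFLOW_URL;     /* would map to HPE_URL_OVERFLOW    */
  if (c->nread_header > MAX_HEADER_SIZE)
    return OVERFLOW_HEADER;  /* would map to HPE_HEADER_OVERFLOW */
  return OVERFLOW_NONE;
}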

sam-github commented 5 years ago

I think it would be useful to refer to what is being limited using the names from the RFC's BNF.

What's the "start line"? Nothing in RFC 2616 is called that.

Request-Line has more than a Request-URI; it also has a Method, an HTTP-Version, and whitespace, any of which could be unreasonably long when crafted maliciously.

Does this measure URI completely separately from the other elements of the Request-Line? If so, is there some other limit on the Request-Line size?

caiolrm commented 5 years ago

@sam-github

> What's the "start line"? Nothing in RFC 2616 is called that.

Take a look at section 4.1 of RFC 2616; it briefly mentions the start line, but there's more detailed information in RFC 7230, sections 3 and 3.1.
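
For reference, RFC 7230 section 3.1 defines the relevant grammar:

start-line   = request-line / status-line
request-line = method SP request-target SP HTTP-version CRLF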

> Request-Line has more than a Request-URI; it also has a Method, an HTTP-Version, and whitespace, any of which could be unreasonably long when crafted maliciously.

> Does this measure URI completely separately from the other elements of the Request-Line? If so, is there some other limit on the Request-Line size?

This does measure the URI completely separately, and you're absolutely correct: there should be a limit on the Request-Line size, since I changed the way it was validated before. I guess I was too focused on the URI and forgot about the rest. Maybe I should add it here:

https://github.com/nodejs/http-parser/blob/3a5a436499bcb8a47f16fc2c3ae999e9ad9eb6d5/http_parser.c#L740-L742

And throw a new error like HPE_START_LINE_OVERFLOW, and also add something similar to the HTTP_MAX_HEADER_SIZE and HTTP_MAX_URL_SIZE macros, say HTTP_MAX_START_LINE_SIZE. How does that sound?
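
A sketch of what that check could look like, modeled on the existing COUNT_HEADER_SIZE macro in http_parser.c (COUNT_START_LINE_SIZE, nread_start_line, HTTP_MAX_START_LINE_SIZE, and HPE_START_LINE_OVERFLOW are hypothetical names; SET_ERRNO and UNLIKELY are helpers already used in that file):

/* hypothetical: bump a dedicated start-line counter and bail out
 * with a start-line-specific error when it exceeds its own limit */
#define COUNT_START_LINE_SIZE(V)                                     \
do {                                                                 \
  nread_start_line += (uint32_t)(V);                                 \
  if (UNLIKELY(nread_start_line > HTTP_MAX_START_LINE_SIZE)) {       \
    SET_ERRNO(HPE_START_LINE_OVERFLOW);                              \
    goto error;                                                      \
  }                                                                  \
} while (0)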

@bnoordhuis Should I do something like this:

#define HTTP_METHOD_LENGTH_MAP(XX) \
  XX(0,  6)  /* DELETE  */         \
  XX(1,  3)  /* GET     */         \
  XX(2,  4)  /* HEAD    */         \
  XX(3,  4)  /* POST    */         \
  XX(4,  3)  /* PUT     */         \
  XX(5,  7)  /* CONNECT */         \
  XX(6,  7)  /* OPTIONS */         \
...

instead, and then have method_lengths declared using it? Like:

static const uint8_t method_lengths[] =
  {
#define XX(num, length) length,
  HTTP_METHOD_LENGTH_MAP(XX)
#undef XX
  };

That would probably decrease the impact of this change.
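
For illustration, after preprocessing, the expansion above would produce roughly:

static const uint8_t method_lengths[] =
  {
  6, 3, 4, 4, 3, 7, 7,
  /* ...one length per HTTP_METHOD_LENGTH_MAP entry... */
  };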

ploxiln commented 5 years ago

you may be able to do something like:

static const uint8_t method_lengths[] =
  {
/* #string stringifies the third HTTP_METHOD_MAP argument, so
   sizeof(#string) - 1 is the method name's length at compile time */
#define XX(num, name, string) (uint8_t)(sizeof(#string) - 1),
  HTTP_METHOD_MAP(XX)
#undef XX
  };
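
For example, the first real HTTP_METHOD_MAP entry, XX(0, DELETE, DELETE), would expand to:

  (uint8_t)(sizeof("DELETE") - 1),  /* == 6 */
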
ploxiln commented 5 years ago

(but there may be a slightly less expensive way to just error-out if the request line overall is too big ...)

bnoordhuis commented 5 years ago

@ploxiln's suggestion works, but it's still possible to exceed the overall limit. This PR stores state in the stack locals url_offset and spaces_before_url, but that information is lost when the input is spread over multiple http_parser_execute() calls.
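
For context, embedders routinely feed the parser partial reads, so any per-request bookkeeping has to survive across calls. A minimal illustration using the public API (the split point and header contents are made up):

#include <string.h>
#include "http_parser.h"

int main(void) {
  http_parser parser;
  http_parser_settings settings;
  /* two reads that split the request line in the middle of the URL */
  const char *part1 = "GET /a/very/long";
  const char *part2 = "/path HTTP/1.1\r\nHost: example.com\r\n\r\n";

  http_parser_settings_init(&settings);
  http_parser_init(&parser, HTTP_REQUEST);

  /* any URL/space bookkeeping held in stack locals inside
     http_parser_execute() is gone between these two calls */
  http_parser_execute(&parser, &settings, part1, strlen(part1));
  http_parser_execute(&parser, &settings, part2, strlen(part2));
  return 0;
}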

caiolrm commented 5 years ago

@bnoordhuis True. I can probably do away with url_offset; it's only being used for comparison. I'll try to figure something out for the spaces, though, since currently there can be multiple spaces before the actual URL.
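
One way to make that bookkeeping survive chunked input would be to move it out of stack locals and into the parser object, next to the existing nread counter; a sketch with a hypothetical field, not the actual fix:

struct http_parser {
  /* ... existing fields ... */
  uint32_t nread;            /* existing field: # bytes read in various scenarios */
  uint32_t nread_start_line; /* hypothetical: start-line bytes seen so far,
                                persists across http_parser_execute() calls */
  /* ... existing fields ... */
};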