ikod / dlang-requests

dlang http client library inspired by python-requests
Boost Software License 1.0
154 stars 32 forks source link

Make header parsing more robust #103

Closed sarneaud closed 5 years ago

sarneaud commented 5 years ago

The old code tries to handle both (correct) "\r\n" line separators and (non-compliant) "\n" line separators, but breaks and throws uncatchable RangeErrors on some servers with strange responses. For example, https://runtimesharks.com/ ends header lines with the proper "\r\n", but includes a Content-Security-Policy header with '\n' characters in it.

This rewrite only splits on "\n" if the HTTP status line ends with "\n" and not "\r\n". I've only found one server that breaks with this rule: rc-httpd, which uses "\r\n" for all lines, except for the Content-Length header, which ends in "\n". (Try http://doc.cat-v.org for an example.) That's broken, but I've included a tweak that makes the Content-Length parsing more forgiving. (It still loses the next header line, but there's only so much that can done to save broken servers. Other clients don't handle these weird servers perfectly, either.)

As a plus, the new parsing heuristic is safer against header injection. Header injection won't work with a server that only uses and sanitises the standard "\r\n" line ending.

ikod commented 5 years ago

Hello @sarneaud

Thanks for PR, will check details today!

ikod commented 5 years ago

Hello @sarneaud

I agree that header parsing can be improved with this PR, but I have question. As I can understand - you expect that all headersEnd characters were received during last read operation. But this assumption can be wrong. Can you, please, test this code with 'Request.bufferSize = 1;`? It should not fail.

Thanks!

sarneaud commented 5 years ago

As I can understand - you expect that all headersEnd characters were received during last read operation.

Only the last-read data is scanned for new '\n' characters, but the full buffer is used in the end-of-header check: if ( buffer.data[0..idx+1].endsWith(headersEnd) )

As you asked, I just ran a test with bufferSize of 1 with about 1000 URLs sampled from the public web. It worked.

ikod commented 5 years ago

Thanks!