basho / riak_api

Riak Client APIs
27 stars 46 forks source link

Allow for HTTP receive buffer to be configurable #123

Closed martinsumner closed 5 years ago

martinsumner commented 5 years ago

In the move from Riak 2.0. and Riak 2.1., there were some significant changes to the internal handling of HTTP in mochiweb.

In Riak 2.0. - Mochiweb when handing a HTTP request had its own logic for handling the receipt of the HTTP request headers from the buffer. Following the move to 2.1. mochiweb depends more on the underlying erlang inet module.

A consequence of this change, is that in 2.0.* if when parsing headers, the receiver buffer was emptied with a binary that did not terminate with a line-feed, the header entry was assumed to be larger than the buffer, and the receive loop would continue to iterate over the whole header that had originally been truncated by the end of the buffer: https://github.com/basho/mochiweb/blob/1.5.1p6/src/mochiweb_http.erl#L136-L192

Now though, if an individual header entry is received which is bigger than the buffer, the socket server will return a tcp_error emsgsize, and mochiweb will return a 400 bad request, without logging the reason why: https://github.com/basho/mochiweb/blob/v2.9.0p2/src/mochiweb_http.erl#L75-L77.

This causes problems in Riak when secondary index terms become large (e.g. for the NHS when we hit a record a person with hundreds of previous names and addresses). The records can not then be PUT.

It would be better if riak_api had a cuttlefish configuration option to set the recbuf, and have that passed through webmachine to mochiweb. This will allow for the NHS to override this value, and set a lager receive buffer to handle larger headers. Rather than a vague 400 error - the specific 400 error or this issue should be returned (431): https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/431

Likewise, it may also be preferable for the MAX_REQUESTS constant to be configurable, although this is not an immediate issue for the NHS.

martinsumner commented 5 years ago

Some more details in this issue:

https://github.com/mochi/mochiweb/pull/208

There are two buffer settings which may be relevant. We've confirmed in testing that setting the recbuf to undefined does indeed cause the erlang user space buffer to be reset to 1460, restricting HTTP header lines even further.

martinsumner commented 5 years ago

There are a number of PRs required to resolve this (pulls onto develop-2.9 branches):

https://github.com/basho/riak_api/pull/124 https://github.com/basho/riak_pb/pull/232 (this resolves a change missing from original RC1, unrelated to this issue) https://github.com/basho/webmachine/pull/14 https://github.com/basho/mochiweb/pull/26

@martincox just to inform you that these changes are pending to be merged in as part of RC2 of the 2.9 release to resolve this issue. The issue was actually caused back in 2.1, but blocks the NHS progressing to 2.9. It allows for the receive buffer to be changed in advanced.config so a system which sends large http headers can increase this buffer, and allow for larger headers than the 8KB default.

I reviewed this with @ramensen this morning. I will merge in tomorrow to allow me to put out RC2.

martinsumner commented 5 years ago

https://github.com/basho/riak_api/pull/124