whitequark / rack-utf8_sanitizer

Rack::UTF8Sanitizer is a Rack middleware which cleans up invalid UTF8 characters in request URI and headers.
MIT License
314 stars 53 forks source link

Add sanitation of cookie header #45

Closed jphager2 closed 6 years ago

jphager2 commented 6 years ago

Cookies can be url encoded. Since they are not included in the list of URI_FIELDS, url encoded invalid UTF8 in cookies are not sanitized. However, just adding HTTP_COOKIE to URI_FIELDS won't work, because in order to safely replace invalid UTF8, we need to first split the cookie header. The regular expression used here is the same used by rack when parsing the cookie.

@whitequark, please let me know you thoughts about this addition.

whitequark commented 6 years ago

because in order to safely replace invalid UTF8, we need to first split the cookie header

What happens if you don't?

Can you show me the Rack code that parses the cookie?

jphager2 commented 6 years ago

Basically the issue is that bad UTF8 can get through the sanitizer if it is url encoded in the cookie. The sanitizer won't catch it, and the application might fail if it does something with that input.

Here is where the cookies are parsed in rack: https://github.com/rack/rack/blob/master/lib/rack/utils.rb#L215

The ';,' corresponds to the matching regexp here: https://github.com/rack/rack/blob/master/lib/rack/query_parser.rb#L6

whitequark commented 6 years ago

Sorry, I wasn't clear. What if you don't split the cookie header?

jphager2 commented 6 years ago

Ah yeah. For that there is two things:

  1. The entire cookie header is not url encoded (only the values)
  2. Cookie values might contain encoded semicolons or commas (the cookie delimiter)

So because of 1), it makes sense to split before decoding/sanitizing and then join back. We could also decode, sanitize and then not reencode, but then you risk breaking the cookie delimitation if there is an encoded delimiter in one of the cookie values.

whitequark commented 6 years ago

Thanks for explanation. It's worth putting this in a comment.

jphager2 commented 6 years ago

Ok cool, will do (probably Monday).

jphager2 commented 6 years ago

@whitequark Comment added. Does it look good now?

whitequark commented 6 years ago

Thanks!