Open nox opened 3 years ago
I'm not sure this is a good first issue, but there are some parsing utilities in https://github.com/http-rs/http-types/blob/main/src/parse_utils.rs for whoever takes this up. Any header that includes quoted-strings or tokens should be implemented with a parser instead of splitting.
Would introducing another dependency be an option? In that case, quoted-string seems like an ideal candidate to avoid duplicate work. It has no nested dependencies and seems to work both ways.
What would the advantage of quoted-string be over the http-types parse-utils, which we wrote specifically for this?
Point taken, missed that implementation. There is still a bit of duplicated work being done in that code though, using the dependency would mean you can "outsource" that part. Outsourcing has its own benefits and downsides of course.
I've been looking at this issue but am a bit confused by how the Accept / Header types work. It's surprising to me that Header is just a Trait and not a real Struct. I guess it's easier to initialize a tuple than to use an exact/imported Header struct type? Doesn't that also mean we're sidestepping some of the good practices that come from Rust's strict type system? It feels a bit like duck typing to me.
That means that
Accept: foo/bar; parisien="baguette, jambon, fromage"
will parse incorrectly.
@nox is there a more real-life example of Accept values you can provide? I've not seen such header styles and can't really think of cases that aren't just documentation. Regardless, this is actually quite dangerous, as we're parsing input that may come from a rogue or bad actor.
@red15 I'm not sure how Header would work as a "real struct" given that each header, once parsed, contains different structured data. It's not duck typing at all, but the trait system at work. There's a library of structs that all conform to a uniform trait that provides affordances for them to parse data out of headers (the definition of which includes concrete structs like Headers, HeaderName, HeaderValues, and HeaderValue) and/or insert structured data into Headers. Importantly, Accept is itself a plain old rust struct: https://github.com/http-rs/http-types/blob/b3a9b9615936adc6727027a00c4b107c0a9ba875/src/content/accept.rs#L51-L54, which could contain any data that is appropriate for the specific header. Where do you see a tuple?
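To illustrate the point about the trait system, here is a minimal sketch. The trait and structs below are hypothetical, simplified stand-ins written for this comment; the names, fields, and signatures are assumptions and not the actual http-types API. It only shows how differently shaped structs, and even a tuple, can conform to one uniform trait while staying statically typed.

```rust
// Hypothetical, simplified stand-in for a header trait.
// NOT the http-types definition; signatures are assumptions for illustration.
trait Header {
    fn header_name(&self) -> String;
    fn header_value(&self) -> String;
}

// Each header is a plain struct holding whatever structured data suits it.
struct Accept {
    media_types: Vec<String>,
}

struct ContentLength {
    length: u64,
}

impl Header for Accept {
    fn header_name(&self) -> String {
        "accept".to_string()
    }
    fn header_value(&self) -> String {
        self.media_types.join(", ")
    }
}

impl Header for ContentLength {
    fn header_name(&self) -> String {
        "content-length".to_string()
    }
    fn header_value(&self) -> String {
        self.length.to_string()
    }
}

// A (name, value) tuple can implement the same trait. This is still
// checked at compile time, so it is not duck typing.
impl Header for (&str, &str) {
    fn header_name(&self) -> String {
        self.0.to_string()
    }
    fn header_value(&self) -> String {
        self.1.to_string()
    }
}

// Code that consumes headers only depends on the trait.
fn render(header: &impl Header) -> String {
    format!("{}: {}", header.header_name(), header.header_value())
}
```

Under these assumptions, `render(&ContentLength { length: 42 })` and `render(&("x-custom", "1"))` both go through the same function, and passing a type that does not implement the trait is a compile error.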
As far as using external libraries, I agree we could have used an external parsing library, but the decision was made to keep parsing logic internal to http-rs crates. I can't fully explain the motivations behind that decision, but it was why I wrote the parse_utils parsers from the http spec for the Forwarded header implementation, which I believed was especially important to do correctly for similar security-related reasons to those you cite.
I'm mostly referring to the tests that use the Trait to extract header_name and header_value. https://github.com/http-rs/http-types/blob/85409417e57e9f61b0d6b40d081bdbaa8f385d07/src/headers/header.rs#L49-L51
Isn't this where the problem with quoted-string parsing would occur?
I asked nicely and now this library has a parser for this you can use: https://github.com/picoHz/mediatype/issues/4 https://docs.rs/mediatype/latest/mediatype/struct.MediaTypeList.html
Found an example of a simple test that should be added to demonstrate this problem:
Simply adding the line
// Handle multiple mime with comma and qualifiers
assert_parse("text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c", "text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c");
would then fail the test with:
---- mime::parse::whatwag_tests stdout ----
thread 'mime::parse::whatwag_tests' panicked at 'assertion failed: `(left == right)`
left: `"text/plain;q=\"0.5, text/html, text/x-dvi\""`,
right: `"text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c"`', src/mime/parse.rs:315:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
That should demonstrate that the MIME type is not parsed properly. Unless my test needs tweaking, the "left" value really doesn't look like what we passed in.
As a side note, perhaps it's worth taking off the good first issue tag as well :)
RFC 7231 section 5.3.2 says:
Note that accept-ext accepts a quoted-string after the equal sign, which means the values can include commas.
The parsing code for Accept splits the Accept header values on comma and then parses each segment with MediaTypeProposal::from_str.
That means that
Accept: foo/bar; parisien="baguette, jambon, fromage"
will parse incorrectly.
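As a rough sketch of the difference, the helper below splits a header value on commas only when they fall outside a quoted-string, honoring backslash escapes inside quotes. `split_outside_quotes` is a hypothetical name made up for this example; it is not part of http-types or its parse_utils, and it does not validate the segments it returns.

```rust
/// Splits a header value on commas that are not inside a quoted-string.
/// Hypothetical helper for illustration; not part of http-types.
fn split_outside_quotes(input: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut start = 0;
    let mut in_quotes = false;
    let mut escaped = false;

    for (i, c) in input.char_indices() {
        if escaped {
            // Previous character was a backslash inside quotes,
            // so this character is taken literally.
            escaped = false;
            continue;
        }
        match c {
            '\\' if in_quotes => escaped = true,
            '"' => in_quotes = !in_quotes,
            ',' if !in_quotes => {
                parts.push(input[start..i].trim());
                // A comma is one byte, so i + 1 is a valid boundary.
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(input[start..].trim());

    // Drop empty list elements such as those produced by "a,,b".
    parts.into_iter().filter(|s| !s.is_empty()).collect()
}
```

With this helper, `foo/bar; parisien="baguette, jambon, fromage"` stays a single segment, whereas a plain `split(',')` cuts it into three pieces. An unquoted list such as `text/plain; q=0.5, text/html` still splits at each comma.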