samscott89 / serde_qs

Serde support for querystring-style strings
Apache License 2.0
193 stars 68 forks source link

More percent decoding issues #13

Closed kardeiz closed 5 years ago

kardeiz commented 7 years ago

Actually... there was a reason percent decoding was performed early in earlier versions. If the keys contain encoded square brackets, they need to be decoded before parsing.

For example, filters%5Btags%5D%5B0%5D=Chess won't be decoded to any reasonable representation.

So, in my limited understanding:

samscott89 commented 7 years ago

Ah no, of course... That's a pain. I'll revisit this. The old way still isn't going to work, but I've got an idea for how the solution might look. Basically, need to squeeze the urldecode in between checking the next character for '&' or '=', and the key/value parsing logic. Since most things here are iterators it might compose nicely.

samscott89 commented 7 years ago

Okay, coming back to this, the new deserializer is correct. It's the serializer which is wrong. Square brackets should not be encoded. As can be seen in this ruby example:

Rack::Utils::build_nested_query({"filter" => {"tags" => ["chess"]}})
 => "filter[tags][]=chess" 

Rack::Utils::build_nested_query({"filter" => {"tags" => ["chess & things[]"]}})
 => "filter[tags][]=chess+%26+things%5B%5D" 

Time to refactor the serializer code!

kardeiz commented 7 years ago

If I'm understanding correctly, I think I would disagree with this conclusion. I think it would be an unnecessary limitation to only "correctly" parse querystrings produced by this library (or by other libraries that follow the same practice of not encoding square brackets).

Rack, for example, I think does the right thing:

Rack::Utils.parse_nested_query("filters%5Btags%5D%5B0%5D=Chess")
=> {"filters"=>{"tags"=>{"0"=>"Chess"}}}

Of course, you might wish to also modify the serializer to not encode square brackets, but I think that would be a separate issue.

samscott89 commented 7 years ago

Ha, interesting. Problem is, querystrings aren't formally defined. So you can get this behaviour:

Rack::Utils::parse_nested_query(Rack::Utils::build_nested_query({"filters" => {"[tags]" => ["Chess"]}}))
 => {"filters"=>{"tags"=>["Chess"]}} 

Where the roundtrip is actually lossy...

It does appear as though your initial assessment was therefore correct, at least for compatibility with Rack. With the above result, that if the keys contain square brackets, then you get potentially strange behaviour.

I think it would be an unnecessary limitation to only "correctly" parse querystrings produced by this library (or by other libraries that follow the same practice of not encoding square brackets).

I think the answer might be a "strict" mode, which adheres to a precisely defined spec (to be defined), and then the other mode would initially decode any brackets. The latter results in just a single line to replace the brackets (leaving, for example, '&' and '=' encoded when in the strings).

samscott89 commented 7 years ago

Okay. Hopefully this series of commits solves all the issues. We have:

kardeiz commented 7 years ago

This looks great! I've confirmed that the non-strict version works as expected for my application.

I see your point and understand that strict mode is inherently more "complete", since it allows a way to include square brackets that are not meant to indicate nesting. But since querystrings can be generated by any number of external sources, I do think it is important to support both modes, as you've done here.

Thanks so much for your explanations and speed!

samscott89 commented 7 years ago

Agreed! And thanks again for using the library to get some real-world feedback. Very useful!

kardeiz commented 6 years ago

Sorry to re-open this, but I think something is still wrong; square brackets in the value position are causing some issues in the deserialization process.

For example, if I add this test:

map_test!("foo=%5BHello%5D", "foo"["[Hello]"]);

it fails with:

---- qs_test_simple stdout ----
    thread 'qs_test_simple' panicked at 'assertion failed: `(left == right)`
  left: `{"foo": "[Hello]"}`,
 right: `{"foo": "[Hell"}`', tests/test_deserialize.rs:178:4

This isn't a priority for me right now (please don't worry about rushing a fix), I just wanted to let you about it. Thanks, and happy new year!

samscott89 commented 6 years ago

You caught me at a good time.. Just trying to get back into some Rust coding after a busy Christmas period... Seemed like a nice gentle place to start. I've tweaked the logic slightly, and published it provisionally as 0.4.0-rc2. Basically, now the non-strict mode only applies to the parsing of keys, which seems like a better idea.

Happy new year to you too!

samscott89 commented 5 years ago

This should have been fixed since https://github.com/samscott89/serde_qs/commit/93d0f47e2aa0fc31a196726163001e31f4824722, and included in versions >= 0.4.