sergot / http-useragent

Web user agent class for Perl 6.
MIT License

URI::Escape is inappropriate for escaping values for x-www-form-urlencoded #219

Closed by dmaestro 5 years ago

dmaestro commented 5 years ago

It seems that spaces in form values are percent-encoded as %20, when they should be replaced by '+'. Unfortunately, x-www-form-urlencoded encoding is not the same as RFC 3986 percent-encoding, as this article explains: https://useyourloaf.com/blog/how-to-percent-encode-a-url-string/

Encoding for x-www-form-urlencoded

The W3C HTML5 recommendation for encoding form data is similar but ever so slightly different from RFC 3986. Section 4.10.22.5 gives us the characters not to percent encode:

ALPHA / DIGIT / “*” / “-” / “.” / “_”

You should also replace the space (“ ”) character with a “+” (0x2B). Note the differences with RFC 3986 as described in this Stack Overflow[1] answer.

[1] http://stackoverflow.com/a/24888789

The following ugly hack works for the use case I have (spaces), but other characters might need to be specially addressed as well. Perhaps a module for this specific encoding?

> my $encoded_uri_component = uri-escape('some text/to encode+ safely').subst('%20', '+', :g);
some+text%2Fto+encode%2B+safely
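
A dedicated encoder might look something like the sketch below. It follows only the rules quoted above: keep ALPHA / DIGIT / "*" / "-" / "." / "_", turn a space into "+", and percent-encode everything else. The name `form-urlencode` is hypothetical (it is not part of URI::Escape or HTTP::UserAgent), and encoding non-ASCII text as UTF-8 bytes is an assumption on my part.

```raku
# Hypothetical helper, not part of any existing module.
# Encodes one value for an x-www-form-urlencoded body.
sub form-urlencode(Str $text --> Str) {
    # Work byte by byte on the UTF-8 encoding (assumed charset).
    $text.encode('utf-8').list.map(-> $byte {
        my $char = $byte.chr;
        $byte == 0x20                                    ?? '+'    # space becomes '+'
        !! $char ~~ /^ <[ A..Z a..z 0..9 * \- . _ ]> $/  ?? $char  # left as-is
        !! sprintf('%%%02X', $byte)                                # everything else
    }).join
}

say form-urlencode('some text/to encode+ safely');
# some+text%2Fto+encode%2B+safely
```

Unlike the substitution hack, this never has to undo an escape after the fact, so a literal "+" in the input still ends up as %2B and only real spaces become "+".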

ugexe commented 5 years ago

Resolved by https://github.com/sergot/http-useragent/pull/220