libwww-perl / URI

The Perl URI module
https://metacpan.org/pod/URI

Characters that cannot be encoded with query_param #67

Open ghost opened 4 years ago

ghost commented 4 years ago

The following characters are not percent-encoded by query_param: ! * ( ) ' The code below reproduces this.

perl -le '
    use strict;
    use URI;
    use URI::QueryParam;

    my $domain = q{https://hoge.com};
    # chr(hex(27)) is single quote
    my $param = {foo => q{!*()}.chr(hex(27))};

    my $uri = URI->new($domain);
    $uri->query_param( %$param );
    print "encoded by URI::QueryParam: ".$uri->as_string;
';
encoded by URI::QueryParam: https://hoge.com?foo=!*()'

Shouldn't these characters be encoded if the module conforms to RFC 3986? Or is there a way to encode them using query_param?

haarg commented 1 year ago

I don't see anything in RFC3986 which would require those characters to be percent encoded.

The URL spec is possibly a more useful reference. The query part of a URL is not required to have the characters !, (, ), or * encoded. But when producing a query string, the spec states:

The application/x-www-form-urlencoded percent-encode set contains all code points, except the ASCII alphanumeric, U+002A (*), U+002D (-), U+002E (.), and U+005F (_).

According to this, !, (, and ) should be encoded when using query_param, but * should not be.
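
As a rough illustration of that rule, here is a minimal sketch that builds the query string by hand with URI::Escape and sets it through $uri->query instead of query_param. The helper name form_escape is hypothetical (it is not part of URI), and the sketch only approximates the application/x-www-form-urlencoded serializer: it leaves ASCII alphanumerics, *, -, . and _ alone, percent-encodes everything else, and turns spaces into +.

```perl
use strict;
use warnings;
use URI;
use URI::Escape qw(uri_escape_utf8);

# Hypothetical helper, not part of URI: percent-encode everything except
# ASCII alphanumerics, "*", "-", "." and "_", then write spaces as "+".
sub form_escape {
    my ($s) = @_;
    my $escaped = uri_escape_utf8($s, q{^A-Za-z0-9*._-});
    $escaped =~ s/%20/+/g;    # a literal "+" was already turned into %2B
    return $escaped;
}

my $uri   = URI->new('https://hoge.com');
my @pairs = ( [ foo => q{!*()'} ] );

# Set the pre-encoded query directly; query_param would apply its own escaping.
$uri->query( join '&', map { form_escape($_->[0]) . '=' . form_escape($_->[1]) } @pairs );

print form_escape(q{!*()'}), "\n";    # %21*%28%29%27
print $uri->as_string, "\n";
```

Setting the query as a pre-encoded string sidesteps query_param entirely, so this is a workaround under the stated assumptions and not a change in how query_param itself behaves.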

kartiksubbarao commented 1 year ago

@haarg In the URL spec, I see that ' should be percent-encoded as well, for http, https, file, ws, and wss URLs (these are called special schemes by the spec):

The special-query percent-encode set is the query percent-encode set and U+0027 (').
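
For comparison, a minimal sketch of what the special-query percent-encode set would escape. It assumes the URL spec's definition of the query percent-encode set (C0 controls, space, ", #, <, >, and all code points above U+007E) plus U+0027 ('). The helper name special_query_escape is hypothetical and not part of URI.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape_utf8);

# Hypothetical helper, not part of URI: escape only the bytes in the
# special-query percent-encode set. The string is UTF-8 encoded first,
# so every non-ASCII code point ends up as escaped bytes >= 0x80.
sub special_query_escape {
    my ($s) = @_;
    return uri_escape_utf8($s, q{\x00-\x20"#<>'\x7F-\xFF});
}

print special_query_escape(q{!*()'}), "\n";    # !*()%27
```

Under this set only the single quote from the original report is encoded, while !, *, ( and ) pass through unchanged.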