ioseb / uri-template

PHP extension implementation of RFC-6570(URI Template) in C - https://datatracker.ietf.org/doc/html/rfc6570
http://pecl.php.net/package/uri_template

URL encodes values that are already encoded #1

Closed mtdowling closed 12 years ago

mtdowling commented 12 years ago

uri_template URL-encodes values that are already percent-encoded, even when they are not inside a template expansion.

php -a
php > echo uri_template('http://foo.com/baz?bar=bam_%21', array());
http://foo.com/baz?bar=bam_%%2

And strangely, the parser does not URL encode the value when it is unencoded:

php > echo uri_template('http://foo.com/baz?bar=bam_!', array());
http://foo.com/baz?bar=bam_!

I think the correct behavior would be to not touch things outside of URI template expansion blocks.

ioseb commented 12 years ago

@mtdowling

Thanks for the feedback!

php -a
php > echo uri_template('http://foo.com/baz?bar=bam_%21', array());
http://foo.com/baz?bar=bam_%%2

The example above looks like a bug. I'm sure the value should be copied as "bam_%21" and not "%%2". It doesn't look like a doubly encoded value; it looks like an incorrectly copied triplet. I will investigate this.

This one:

php > echo uri_template('http://foo.com/baz?bar=bam_!', array());
http://foo.com/baz?bar=bam_!

is correct, I think. The spec says:

2.1.  Literals
   The characters outside of expressions in a URI Template string are
   intended to be copied literally to the URI reference if the character
   is allowed in a URI (reserved / unreserved / pct-encoded) or, if not
   allowed, copied to the URI reference as the sequence of pct-encoded
   triplets corresponding to that character's encoding in UTF-8
   [RFC3629].

     literals      =  %x21 / %x23-24 / %x26 / %x28-3B / %x3D / %x3F-5B
                   /  %x5D / %x5F / %x61-7A / %x7E / ucschar / iprivate
                   /  pct-encoded
                        ; any Unicode character except: CTL, SP,
                        ;  DQUOTE, "'", "%" (aside from pct-encoded),
                        ;  "<", ">", "\", "^", "`", "{", "|", "}"

Maybe I'm misinterpreting something, but "!" (%x21) is one of the literals that must be copied directly. See also the Reserved Characters section of the URI spec: http://tools.ietf.org/html/rfc3986#section-2.2
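
To make the rule quoted above concrete, here is a rough sketch in C of how literal text outside expressions could be copied. It is not the extension's actual code: the names is_literal_char and copy_literals are hypothetical. It assumes the input is UTF-8, that the output is a plain URI (so non-ASCII bytes are percent-encoded rather than kept as ucschar), and that a stray "%" not followed by two hex digits is encoded as "%25" instead of being reported as an error.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is byte c allowed as-is by the ASCII part of the
 * `literals` rule in RFC 6570 section 2.1? "%" is deliberately absent,
 * because it is only valid as the start of a pct-encoded triplet. */
static int is_literal_char(unsigned char c)
{
    return c == 0x21 || (c >= 0x23 && c <= 0x24) || c == 0x26 ||
           (c >= 0x28 && c <= 0x3B) || c == 0x3D ||
           (c >= 0x3F && c <= 0x5B) || c == 0x5D || c == 0x5F ||
           (c >= 0x61 && c <= 0x7A) || c == 0x7E;
}

/* Hypothetical sketch: copy literal text (no "{...}" expressions) from src
 * into dst. Returns the number of bytes written, excluding the trailing
 * NUL, or (size_t)-1 if dst is too small. */
static size_t copy_literals(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *p;
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (p = (const unsigned char *)src; *p != '\0'; p++) {
        if (*p == '%' && isxdigit(p[1]) && isxdigit(p[2])) {
            /* Already a pct-encoded triplet: copy all three bytes as-is. */
            if (n + 3 >= dst_size) return (size_t)-1;
            dst[n++] = (char)p[0];
            dst[n++] = (char)p[1];
            dst[n++] = (char)p[2];
            p += 2; /* skip the two hex digits just copied */
        } else if (is_literal_char(*p)) {
            /* Allowed literal such as "!": copy unchanged. */
            if (n + 1 >= dst_size) return (size_t)-1;
            dst[n++] = (char)*p;
        } else {
            /* Anything else (space, stray "%", UTF-8 bytes): encode it. */
            if (n + 3 >= dst_size) return (size_t)-1;
            dst[n++] = '%';
            dst[n++] = hex[*p >> 4];
            dst[n++] = hex[*p & 0x0F];
        }
    }
    if (n >= dst_size) return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

Under these assumptions, passing 'http://foo.com/baz?bar=bam_%21' through copy_literals leaves it unchanged, 'bam_!' also stays as it is, and a space would come out as '%20'. The key point is that the triplet branch copies all three bytes and advances past them in one step, which is where a copy bug like the "%%2" output would most likely hide.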

mtdowling commented 12 years ago

Thanks. Yeah, the second example does the right thing. It was just me showing that you're not URL-encoding literals in general, so there must be a different issue in the parser.

ioseb commented 12 years ago

Okay, will try to resolve the issue as quickly as possible. Thanks again for the bug report.

ioseb commented 12 years ago

@mtdowling

Fixed the bug with URI-encoded triplets, and also fixed a copy issue with UTF-8 characters. The release is published on PECL: http://pecl.php.net/package/uri_template

mtdowling commented 12 years ago

Nice!