sergot / http-useragent

Web user agent class for Perl 6.
MIT License
36 stars 39 forks source link

HTTP::Cookies cannot parse cookies having `-` in names #154

Closed astj closed 7 years ago

astj commented 7 years ago

Hi.

I found HTTP::Cookies cannot parse some cookies. Here're snippets to reproduce the problem. Cookies having - in names are not parsed.

> $c = HTTP::Cookies.new; $c.set-cookie('Set-Cookie: mykey=myvalue;'); $c.say;
HTTP::Cookies.new(cookies => [HTTP::Cookie.new(name => "mykey", value => "myvalue", secure => Bool::False, httponly => Bool::False, path => Any, domain => Any, version => Any, expires => Any, fields => {})], file => Any, autosave => 0)
> $c = HTTP::Cookies.new; $c.set-cookie('Set-Cookie: mykey=my-value;'); $c.say;
HTTP::Cookies.new(cookies => [HTTP::Cookie.new(name => "mykey", value => "my-value", secure => Bool::False, httponly => Bool::False, path => Any, domain => Any, version => Any, expires => Any, fields => {})], file => Any, autosave => 0)
> $c = HTTP::Cookies.new; $c.set-cookie('Set-Cookie: my-key=my-value;'); $c.say;
HTTP::Cookies.new(cookies => [], file => Any, autosave => 0)

In current implementation, HTTP::Cookies::Grammer defines name as \w+. It seems the problem is due to this implementation. According to RFC6265, Cookie name is RFC2616's token. and the RFC defines token as follows:

       CHAR           = <any US-ASCII character (octets 0 - 127)>
       CTL            = <any US-ASCII control character
                        (octets 0 - 31) and DEL (127)>
       token          = 1*<any CHAR except CTLs or separators>
       separators     = "(" | ")" | "<" | ">" | "@"
                      | "," | ";" | ":" | "\" | <">
                      | "/" | "[" | "]" | "?" | "="
                      | "{" | "}" | SP | HT

In my understand, following characters are accepted as token's elements:

> (32...126).map({chr($_)}).grep({!' ()<>@,;:\"/[]?={}'.index($_).defined}).join
!#$%&'*+-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ^_`abcdefghijklmnopqrstuvwxyz|~

And those characters seems to be RFC-accepted but HTTP::Cookie::Grammer doesn't accept.

> (32...126).map({chr($_)}).grep({!' ()<>@,;:\"/[]?={}'.index($_).defined}).grep({$_ !~~ /\w/ }).join
!#$%&'*+-.^`|~

I don't know HTTP::Cookies::Grammer accepts all those characters as name, but I think it's okay if it accepts some characters like -.

Example

Perl5's Cookie::Baker escapes name on baking cookies when it matched m![^a-zA-Z\-\._~]! : https://metacpan.org/source/KAZEBURO/Cookie-Baker-0.07/lib/Cookie/Baker.pm#L36

ugexe commented 7 years ago

Its worth noting that an RFC based grammar for this is implemented in Grammar::HTTP:

nickl@li685-90:~/perl6$ perl6 -MGrammar::HTTP -e '\
> my $parser = Grammar::HTTP.new;
> my $cookie-string = "Set-Cookie: my-key=my-value";
> say $parser.parse($cookie-string, :rule<set-cookie-header>)
> '

「Set-Cookie: my-key=my-value」
 Set-Cookie => 「my-key=my-value」
  cookie-pair => 「my-key=my-value」
   cookie-name => 「my-key」
   cookie-value => 「my-value」
jonathanstowe commented 7 years ago

I was womdering that too

jonathanstowe commented 7 years ago

It is probable that if it did use HTTP::Grammar then we can close the #142 as well.

jonathanstowe commented 7 years ago

In the spirit of agility however I think that I'll fix this without Grammar::HTTP, then attempt to implement that separately.

jonathanstowe commented 7 years ago

I've adjusted the token to be somewhat more like the spec (i.e. any non white space character except one of the separators,) and it passes a test based on your example above.

astj commented 7 years ago

Thanks!