raku-community-modules / URI

Raku realization of URI — Uniform Resource Identifiers handler
Artistic License 2.0

Cannot parse URI with query/fragment/path containing square brackets ([]) #47

Closed: CIAvash closed this issue 1 year ago

CIAvash commented 3 years ago

Parsing fails, although the same URL works in the browser:

URI.new: 'https://httpbin.org/get?query[0]=test'
Could not parse URI: https://httpbin.org/get?query[0]=test
  in method parse at ... (URI) line 283
  in method new at ... (URI) line 381

JJ commented 1 year ago

I've been checking the grammar, and I don't think square brackets are part of the pchar token that defines that part of the query. Square brackets are gen-delim (and they are defined as such in the grammar, which is a pretty faithful rendering of the original grammar), not sub-delim, which is what the URI in the OP would need them to be. The error message might be a bit more informative, though I don't see how. So I'm for closing this issue, if no one objects.
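
To illustrate the point about pchar, here is a minimal Python sketch of the RFC 3986 query rule (query = *( pchar / "/" / "?" ), with pchar = unreserved / pct-encoded / sub-delims / ":" / "@"). It is not the Raku module's grammar; the names QUERY_RE and is_valid_query are made up for this example.

```python
import re

# Character sets as listed in RFC 3986 (sections 2.2 and 2.3).
SUB_DELIMS = "!$&'()*+,;="
GEN_DELIMS = ":/?#[]@"

# query = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# Hypothetical helper regex: "[" and "]" are deliberately absent from the class.
QUERY_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*")


def is_valid_query(query: str) -> bool:
    """Return True if the whole string matches the RFC 3986 query rule."""
    return QUERY_RE.fullmatch(query) is not None


if __name__ == "__main__":
    print(is_valid_query("query[0]=test"))      # raw brackets
    print(is_valid_query("query%5B0%5D=test"))  # percent-encoded brackets
```

Under this rule the raw-bracket query is rejected and the percent-encoded one is accepted, which matches the behaviour reported in the OP.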

jonathanstowe commented 1 year ago

Yeah, I think the grammar in the RFC (quite hard to follow, because it is spread out so much) precludes the [ and ] there because, as you say, they aren't in sub-delim. They probably should be "percent encoded" as (e.g.) `%5B` and `%5D`.

It's entirely possible that either the browser is encoding under the hood, or that everything is being more tolerant than the grammar in the RFC. https://httpbin.org/get?query%5B0%5D=test works fine everywhere.
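
As a rough sketch of what "encoding under the hood" could look like, the following Python snippet percent-encodes only the query component before the URL is handed to a strict parser. It uses the standard library's urllib.parse; the function name encode_query is hypothetical, and the choice of safe characters is an assumption based on the RFC 3986 query rule.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched in the query: sub-delims, ":", "@", "/", "?",
# plus "%" so that existing percent-escapes are not encoded a second time.
QUERY_SAFE = "!$&'()*+,;=:@/?%"


def encode_query(url: str) -> str:
    """Percent-encode characters that RFC 3986 does not allow raw in a query."""
    parts = urlsplit(url)
    encoded = quote(parts.query, safe=QUERY_SAFE)
    return urlunsplit(parts._replace(query=encoded))


if __name__ == "__main__":
    print(encode_query("https://httpbin.org/get?query[0]=test"))
```

Keeping "%" in the safe set makes the function leave an already encoded URL unchanged, at the cost of not escaping a stray literal "%".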

JJ commented 1 year ago

> It's entirely possible that either the browser is encoding under the hood, or that everything is being more tolerant than the grammar in the RFC. https://httpbin.org/get?query%5B0%5D=test works fine everywhere.

Browsers are known for that kind of thing: you can put whatever you want in the address bar and it will be encoded.

So... closing?

jonathanstowe commented 1 year ago

Yeah, I think so. curl goes so far as to use [] for its own purposes (URL globbing), and its man page says:

       -g, --globoff
              This option switches off the "URL globbing parser". When you set this option, you can specify URLs that contain the letters {}[] without having curl itself interpret them. Note that these letters are not normal legal URL contents but they should be encoded according to the URI standard.