croservices / cro-core

The heart of the Cro library for building distributed systems in Raku, including pipeline composition and TCP support.
https://cro.services/
Artistic License 2.0

Unable to parse URI from `Cro::Uri::HTTP` #8

Closed benjif closed 3 years ago

benjif commented 6 years ago
⚠ bblog Unable to parse URI '/user/password?name[%23post_render][0]=printf&name[%23markup]=ABCZ%0A': malformed syntax
⚠ bblog   in method parse-request-target at /opt/rakudo-pkg/share/perl6/site/sources/0B03B8A31325388D365143627EE2060292457C4A (Cro::Uri::HTTP) line 37
⚠ bblog   in method ensure-cached-uri at /opt/rakudo-pkg/share/perl6/site/sources/D2E3C530EF44DE32A8B899E2F096321377ADB399 (Cro::HTTP::Request) line 93
⚠ bblog   in method path at /opt/rakudo-pkg/share/perl6/site/sources/D2E3C530EF44DE32A8B899E2F096321377ADB399 (Cro::HTTP::Request) line 63
⚠ bblog   in block  at /opt/rakudo-pkg/share/perl6/site/sources/7C719666D0F678DBDE30BEFAA802017E80D2DF2C (Cro::HTTP::Router) line 217
⚠ bblog   in block  at /opt/rakudo-pkg/share/perl6/site/sources/32DA898BE7D88D49C2A69159C9F9C42E70E61E30 (Cro::HTTP::Internal) line 22
⚠ bblog   in block  at /opt/rakudo-pkg/share/perl6/site/sources/204CE395C606542C93B4CD46D5A2A5AB27722A5B (Cro::HTTP::RequestParser) line 119
⚠ bblog   in block  at /opt/rakudo-pkg/share/perl6/site/sources/204CE395C606542C93B4CD46D5A2A5AB27722A5B (Cro::HTTP::RequestParser) line 52
⚠ bblog   in block  at /opt/rakudo-pkg/share/perl6/site/sources/E0EB5AF82A4A7046486A0257DAC5ED54BE7B817D (Cro::TCP) line 53
⚠ bblog                  
jnthn commented 6 years ago

The URI RFC defines the allowed characters in a query string as:

query       = *( pchar / "/" / "?" )

Where pchar is defined as:

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
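As a rough illustration of what that grammar accepts, here is a minimal Python sketch that transcribes the `query` rule into a regular expression and runs it against the query string from the report. This is not how Cro::Uri::HTTP is implemented; the names `STRICT_QUERY`, `is_strict_query`, and `offending_chars` are made up for the example.

```python
import re

# Character sets transcribed from the RFC 3986 ABNF quoted above.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
# pchar / "/" / "?", with pct-encoded handled as a separate alternative.
ALLOWED = rf"{UNRESERVED}{SUB_DELIMS}:@/?"

# query = *( pchar / "/" / "?" )
STRICT_QUERY = re.compile(rf"(?:[{ALLOWED}]|%[0-9A-Fa-f]{{2}})*")
# One grammar token: a percent-encoded octet or a single allowed character.
TOKEN = re.compile(rf"%[0-9A-Fa-f]{{2}}|[{ALLOWED}]")


def is_strict_query(query: str) -> bool:
    """Return True if the whole string matches the strict query rule."""
    return STRICT_QUERY.fullmatch(query) is not None


def offending_chars(query: str) -> list[str]:
    """Collect every character that the strict grammar does not permit."""
    bad = []
    i = 0
    while i < len(query):
        m = TOKEN.match(query, i)
        if m:
            i = m.end()          # skip a valid token (1 or 3 characters)
        else:
            bad.append(query[i])  # record the character that broke the rule
            i += 1
    return bad


# Query part of the request target from the report.
reported = "name[%23post_render][0]=printf&name[%23markup]=ABCZ%0A"

print(is_strict_query(reported))              # False
print(sorted(set(offending_chars(reported)))) # ['[', ']']
```

Everything else in the reported query, including the `%23` and `%0A` escapes, passes the strict rule; only the square brackets are rejected.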

The query string here contains unencoded [ and ] characters, which are not allowed there. Unfortunately, however, there are a lot of non-compliant URIs around, and the RFC pretty much amounts to "just cope":

If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.

So, for the sake of non-compliant producers, we should loosen things up a bit.
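One way such loosening could look, sketched in Python rather than Raku: extend the strict character set with `[` and `]` and then treat them as plain data, as the RFC passage suggests. The choice of exactly which extra characters to admit is an assumption made for this example, and `LENIENT_QUERY` and `is_lenient_query` are hypothetical names; the thread does not say what the actual fix in Cro does.

```python
import re
from urllib.parse import parse_qs

# Strict RFC 3986 character sets for the query component.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="

# Assumption for this sketch: additionally tolerate unencoded "[" and "]".
LENIENT_QUERY = re.compile(
    rf"(?:[{UNRESERVED}{SUB_DELIMS}:@/?\[\]]|%[0-9A-Fa-f]{{2}})*"
)


def is_lenient_query(query: str) -> bool:
    """Return True if the query matches the loosened rule."""
    return LENIENT_QUERY.fullmatch(query) is not None


reported = "name[%23post_render][0]=printf&name[%23markup]=ABCZ%0A"
print(is_lenient_query(reported))  # True

# The brackets have no delimiting role here, so they are kept as data octets;
# only the percent-encoded sequences are decoded.
print(parse_qs(reported))
# {'name[#post_render][0]': ['printf'], 'name[#markup]': ['ABCZ\n']}
```

Keeping the loosened set small means input that is malformed in other ways, such as a stray space or a bare `%`, is still rejected.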

jnthn commented 3 years ago

Resolved by #30.