Open JessRudder opened 2 years ago
I've noticed what appears to be inconsistent behavior with URI(my_uri). I tried reading the RFCs that URI is based on to understand if this was by design (but I couldn't parse the RFC all that well).
Please have a look at https://www.rfc-editor.org/rfc/rfc3986#appendix-A. The starting production you need is URI-reference, which includes relative URIs.
If I run
URI("*.example.com:5555")
I get the following error: URI::InvalidURIError: bad URI(is not URI?): "*.example.com:5555" from /Users/jessrudder/.rbenv/versions/3.1.2/lib/ruby/3.1.0/uri/rfc3986_parser.rb:67:in `split'
The first colon is the delimiter between the scheme and the rest, and "*.example.com" isn't a valid scheme, so this is expected.
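To make the scheme rule concrete, here is a minimal sketch that tests the text before the first colon against the RFC 3986 scheme grammar (`ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )`). The helper name `candidate_scheme` and the constant `SCHEME_PATTERN` are made up for illustration; they are not part of Ruby's URI library.

```ruby
# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_PATTERN = /\A[A-Za-z][A-Za-z0-9+\-.]*\z/

# Hypothetical helper: returns the text before the first colon when it is a
# syntactically valid scheme, and nil otherwise (including when there is no colon).
def candidate_scheme(text)
  prefix, separator, _rest = text.partition(":")
  return nil if separator.empty?

  SCHEME_PATTERN.match?(prefix) ? prefix : nil
end

candidate_scheme("*.example.com:5555")          # nil: "*" cannot start a scheme
candidate_scheme("subdomain.example.com:5555")  # "subdomain.example.com"
candidate_scheme("http://*.example.com:5555")   # "http"
```

This only checks syntax. Whether a scheme is registered or known to the library is a separate question.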
If I keep the wildcard and the port but add a scheme, I get the behavior I expected
URI("http://*.example.com:5555")
scheme: "http" userinfo: nil host: "*.example.com" port: 5555 path: ""
Of course.
If I don't have the wildcard domain or a scheme, it appears to work but not all methods respond as expected
URI("subdomain.example.com:5555")
scheme: "subdomain.example.com" userinfo: nil host: nil port: nil path: nil
Again, the part before the first colon is the scheme, but this time "subdomain.example.com" is syntactically valid as a scheme. Periods are allowed in schemes, although no such scheme actually exists. The library could apply generic syntax to the rest (which would put 5555 into the path), but it probably relies on scheme-specific code for the remainder of the URI, and none exists for this scheme. That's my guess as to why the rest of the fields are empty.
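A quick way to inspect this case in irb is sketched below. As far as I can tell from the RFC 3986 parser shipped with Ruby, the text after the scheme is not discarded: because it does not start with "/", it ends up under `#opaque` instead of `#path`. Treat that as an observation about the Ruby versions I know of, not a documented guarantee.

```ruby
require "uri"

uri = URI("subdomain.example.com:5555")

uri.scheme  # "subdomain.example.com"
uri.host    # nil
uri.port    # nil
uri.path    # nil
uri.opaque  # "5555" (assumption: rootless remainder is stored as opaque)

# The wildcard variant fails earlier, because "*" cannot appear in a scheme.
wildcard_error =
  begin
    URI("*.example.com:5555")
    nil
  rescue URI::InvalidURIError => e
    e
  end
```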
If I remove the port and don't have a scheme, it appears to work but not all methods respond as expected
URI("*.example.com")
scheme: nil userinfo: nil host: nil port: nil path: "*.example.com"
Relative URIs are first and foremost for navigating inside a single site. That's why a bare "*.example.com" is interpreted as a path component, not as a host.
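The difference shows up if you compare the bare string with a network-path reference, which is the RFC 3986 term for a relative reference that starts with "//" and therefore carries an authority. The sketch below is only an illustration of that grammar rule, not a recommendation for how callers should normalize input.

```ruby
require "uri"

# No scheme and no leading "//": the whole string is a relative path.
bare = URI("*.example.com")
bare.host  # nil
bare.path  # "*.example.com"

# A leading "//" marks what follows as an authority (host and optional port).
network = URI("//*.example.com:5555")
network.scheme  # nil
network.host    # "*.example.com"
network.port    # 5555
```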
I'd be happy to try to work on a PR but wanted to confirm that this behavior was incorrect before I did that. Thanks!
Well, to me the behavior looks correct. To most humans, things such as "subdomain.example.com" smell strongly of domain names, but they could be a scheme or a path component, too.