Closed rubycon closed 6 months ago
I don't think WebIDL conversion is applicable here, but nonetheless I agree that we need to replace null with an empty string in a few places here.
@jeremyroman There's somewhat a related bug when creating a pattern parser with some of the encoding callbacks.
These callbacks also rely on URL spec's basic parser and do not check for null
value with the exception of the port canonicalizing callback but as with the base URL the integer port should also be serialized or it fail the first escape a regexp string assertion.
As I understand these callbacks should always return a string, as all patterns are string.
Acknowledged on the port needing to be serialized there. The others seem fine to me, as they initialize the URL record's relevant field into a non-null state and then proceed with parsing from a state that I think will never set them back to null. Can you be more specific?
@jeremyroman Yes, it seems fine for the other components as null
is always replaced by "*"
during the component compilation.
But should processedInit
fields be set to null
in the first place when they must be USVString
. If it's OK to do so for internal structs that are not exposed through the API I think it shouldn't be an issue.
This issue is also discussed here https://github.com/whatwg/urlpattern/issues/204#issuecomment-1891552179
Well, it's not that they must be USVString
, it's that they must be USVString
if they are present. Web IDL dictionaries are ordered maps and the dictionary members are only required keys if they are specified as required
.
It is incorrect to initialize these to null; instead, we should be conditionally assigning them and leaving them absent from the map otherwise.
Hi @jeremyroman, hostnames seems to suffer from the same issue. The basic URL parser will parse a host as a hostname or an IP address. Addresses can be represented as an single integer or as an array of integer, so they also need to be explicitly serialized, here :
To canonicalize a hostname given a string value
- 5. Return dummyURL’s host.
The parsed host and port in the match steps will also need to be properly serialized :
To perform a match given a URLPattern urlpattern, a URLPatternInput input, and an optional string baseURLString:
@jeremyroman Also: minor typo found in canonicalize a port. Argument passed as portValue and then referred as value.
@jeremyroman @annevk I submitted a PR to fix this issue + the minor typo but it seems it's still pending. I applied Jeremy recommendations to my commit message and submitted a participation agreement.
I'm new to this process, is there something else I'm missing ?
@rubycon did your PR fully resolve this issue? (i.e., shall I close it now?)
@jeremyroman Yes, it's fully resolved!
What is the issue with the URL Pattern Standard?
A correct implementation of the current spec fail with this kind of simple input:
The issue comes in part from the use of URL spec's internal parsers. These parsers manipulate URL records with some fields (host, port, query, fragment) that default to
null
when empty.When those empty fields are then used to fill the default url pattern (step 11-7 to 11-10 of process a URLPatternInit), the processing a base URL string assertion fail (Assert: input is not null.)
In the special case of the baseURL port, URL record use unsigned integers and not strings, so the assertion pass but the code fail further down in escape a pattern string (1. Assert: input is an ASCII string.).
As per Web IDL, implicit string conversion would yield
"null"
for those empty field (as the URL record struct is not marked[LegacyNullToEmptyString]
) which is not what we want.The spec should check if a baseURL field is not
null
before using it to fill the url pattern. As for the port it should be converted to a string but is the processing of the stringified port really necessary as we are certain its only ASCII characters and there's nothing to escape ?).