whatwg / url

URL Standard
https://url.spec.whatwg.org/

API mechanism for reporting validity errors #811

Open jasnell opened 7 months ago

jasnell commented 7 months ago

What is the issue with the URL Standard?

The spec defines validation errors that can occur while parsing and recommends that implementations report them in some way. The URL API does not surface them at all, which leaves it to the host implementation to report validation errors out of band.

It would likely be useful for the API to provide some reasonable means of handling validity state...

For instance, a few approaches come to mind:

A. Adding a valid or invalid property to URL. The value would be a boolean. It would not report the specific validation error, only whether any validation error exists.

const url = new URL('file:foobar');
console.log(url.valid); // false

B. Adding a validationErrors property that is an array of validation errors. These can be numeric codes or strings that represent the specific collection of validation errors defined in the spec. A null value for this property would indicate no validation errors. If a value is provided it must be an array. This would allow both simple checking for valid/invalid and specific reporting for individual validation errors.

const url = new URL('file:foobar');
console.log(url.validationErrors); // ['special-scheme-missing-following-solidus']

C. Adding an opt-in strict parsing mode that converts validation errors into thrown exceptions.

const url = new URL.Strict('file:foobar');  // throws!

Of these options, I generally prefer A as I do not really think there's much practical reason to break down exactly why the URL is invalid as much as simply calling it out.
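To make the shapes of options B and C concrete, here is a rough userland sketch. Everything in it is hypothetical: `collectValidationErrors`, `validationErrorsFor`, and `parseStrict` are illustrative names, not proposed API, and the checker only approximates a single validation error from the spec by inspecting the input string.

```javascript
// Hypothetical userland approximation; nothing here is part of the URL Standard.
const SPECIAL_SCHEMES = new Set(['http', 'https', 'ws', 'wss', 'ftp', 'file']);

// Approximates one validation error only: a special scheme not followed by "//".
function collectValidationErrors(input) {
  const errors = [];
  const match = /^([A-Za-z][A-Za-z0-9+.-]*):(.*)$/s.exec(input);
  if (match && SPECIAL_SCHEMES.has(match[1].toLowerCase()) && !match[2].startsWith('//')) {
    errors.push('special-scheme-missing-following-solidus');
  }
  return errors;
}

// Option B shape: null when nothing was detected, otherwise an array.
function validationErrorsFor(input) {
  const errors = collectValidationErrors(input);
  return errors.length > 0 ? errors : null;
}

// Option C shape: parse as usual, but throw if anything was detected.
function parseStrict(input) {
  const url = new URL(input); // still throws TypeError on outright parse failure
  const errors = collectValidationErrors(input);
  if (errors.length > 0) {
    throw new TypeError(`URL validation errors: ${errors.join(', ')}`);
  }
  return url;
}

console.log(validationErrorsFor('file:foobar')); // ['special-scheme-missing-following-solidus']
console.log(validationErrorsFor('https://example.org/')); // null
```

Option A would then be equivalent to checking whether `validationErrorsFor(input)` is null. A real implementation would report errors from inside the parser rather than re-scanning the input.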

karwa commented 7 months ago

With regard to adding properties on the URL value: what happens if you mutate the value using one of the property setters?

For instance, let's imagine the path contains a % sign which is not part of a percent-encoded byte (e.g. https://example.org/%s). This would produce an invalid-URL-unit validation error, and url.valid (option A) of the resulting value should return false.

Now let's imagine somebody uses the path setter to change the path to something valid. Does url.valid still return false?

In general, how do we know there is not some other component which also contains non-valid contents (e.g. the same %s in the query)?

jasnell commented 7 months ago

For options A and B, the new properties would reflect the current state of all of the URL's components taken together, so if a setter call changes the validity of the URL, the property would change to reflect that.
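A minimal sketch of what "recomputed from the current components" could look like, using only the existing URL API. The helpers `hasBarePercent` and `componentsLookValid` are hypothetical names, and the check covers just the stray % case from the example above, not the full set of validation errors.

```javascript
// Hypothetical helper: true if the string contains a % that does not start
// a percent-encoded byte (two hex digits).
function hasBarePercent(component) {
  return /%(?![0-9A-Fa-f]{2})/.test(component);
}

// Hypothetical stand-in for option A, re-evaluated on every call from the
// URL's current path, query, and fragment.
function componentsLookValid(url) {
  return ![url.pathname, url.search, url.hash].some(hasBarePercent);
}

const url = new URL('https://example.org/%s?q=%s');
console.log(componentsLookValid(url)); // false

url.pathname = '/ok';
console.log(componentsLookValid(url)); // false, the query still contains %s

url.search = '?q=ok';
console.log(componentsLookValid(url)); // true
```

Because the result is derived from every component on each read, fixing the path alone does not flip it while the query is still invalid.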

karwa commented 7 months ago

It would also reflect the last operation, wouldn't it? For instance, non-fatal validation errors can occur when an IPv4 address contains non-decimal parts - e.g. https://0x7F.1. What's interesting about this is that the parser reformats the IP address so that it is valid - the above produces the URL https://127.0.0.1.

In other words, this property would not be idempotent.

const url = new URL('https://0x7F.1');
console.log(url.valid); // false

const url2 = new URL(url.href);
console.log(url2.valid); // true!
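The normalization behind this can be observed with today's API, without assuming any valid property. The snippet below only shows that the non-decimal part exists in the input string and nowhere in the parsed result.

```javascript
// Uses only the existing URL API; no validity property is assumed.
const input = 'https://0x7F.1';
const url = new URL(input);

// The hexadecimal part is gone after parsing.
console.log(url.hostname); // '127.0.0.1'

// Re-parsing the serialized form changes nothing further.
const url2 = new URL(url.href);
console.log(url2.href === url.href); // true

// The original input string is the only place the non-decimal part survives.
console.log(input.includes(url.hostname)); // false
```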

I think this would be extremely difficult for users to understand or use effectively.

This suggests that validity may not in fact be a property of the URL, but rather a property of the inputs used to create the URL. So C (including "strict" versions of property setters, etc) would be the best API, IMO.

annevk commented 7 months ago

It would likely be useful for the API to provide some reasonable means of handling validity state.

I think we need to start with fleshing this out more. See also point 1 of https://whatwg.org/faq#adding-new-features.