whatwg / url

URL Standard
https://url.spec.whatwg.org/

Initialize the IgnoreInvalidPunycode flag when calling UTS 46 #821

Open hsivonen opened 7 months ago

hsivonen commented 7 months ago

What is the issue with the URL Standard?

UTS 46 revision 31 added an IgnoreInvalidPunycode flag to its ToASCII and ToUnicode operations. The URL Standard should be explicit about the value of this flag when it calls into ToASCII or ToUnicode.

hsivonen commented 6 months ago

AFAICT, the current behavior of Firefox and Safari would be consistent with setting this flag to false and Chrome’s behavior would be consistent with setting this flag to true.

Looking at how browsers comply with the existing spec: Safari seems to comply well; Firefox seems to comply, except that it fails to enforce the bidi rule on LTR labels in a bidi domain name (i.e. Firefox enforces the bidi rule on a per-label basis); and Chrome’s behavior seems hard to explain from the spec.

These observations would support setting IgnoreInvalidPunycode to false. However, I’m missing some context of why the IgnoreInvalidPunycode flag was introduced in UTS 46. The rationale says it enables an ASCII fast path, but UTS 46 still requires validating xn-- labels that decode successfully as Punycode, so the flag does not, AFAICT, enable an ASCII fast path in general (and the “industry practice” evidently doesn’t cover Firefox and Safari).
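
To make the fast-path point concrete, here is a minimal Python sketch of the one step the flag appears to affect. It is an illustration under assumptions, not the UTS 46 algorithm: `decode_xn_label` is a hypothetical helper, it uses Python's built-in `punycode` codec as a stand-in for RFC 3492 decoding, and it leaves out mapping, normalization, and the validity criteria entirely.

```python
def decode_xn_label(label: str, ignore_invalid_punycode: bool) -> tuple[str, bool]:
    """Return (resulting label, error recorded?) for a single ASCII label.

    Hypothetical helper: models only the Punycode-decoding step and
    assumes the label is already ASCII.
    """
    if not label.lower().startswith("xn--"):
        # Not an A-label candidate; nothing to decode in this sketch.
        return label, False
    try:
        decoded = label[4:].encode("ascii").decode("punycode")
    except UnicodeError:
        # Decoding failed: the flag decides whether this counts as an error.
        # With the flag set, the label passes through unchanged.
        return label, not ignore_invalid_punycode
    # Decoding succeeded: the decoded label would still need validation
    # (omitted here), so the flag saves no work on this path.
    return decoded, False


print(decode_xn_label("xn--mnchen-3ya", False))
print(decode_xn_label("xn--a-!", False))
print(decode_xn_label("xn--a-!", True))
```

In this sketch the flag changes the outcome only for labels whose Punycode decoding fails; any `xn--` label that decodes successfully takes the same path either way, which is why the flag alone does not seem to yield a general ASCII fast path.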

@markusicu, @macchiati, can you share more context for the motivation of IgnoreInvalidPunycode and how you’d expect the URL Standard to set the flag?

macchiati commented 6 months ago

I can't remember off the top of my head; would have to look back at the development notes.

annevk commented 6 months ago

Yeah, I don't understand this either. This was not part of our feedback on UTS 46 last year (#744), and I would not want ASCII special-casing of this sort.