whatwg / url

URL Standard
https://url.spec.whatwg.org/

Inconsistency in Handling `special-scheme-missing-following-solidus` URLs #822

Closed RedYetiDev closed 7 months ago

RedYetiDev commented 7 months ago

What is the issue with the URL Standard?

Description:

The current implementation of the WhatWG URL standard exhibits inconsistency in handling special-scheme-missing-following-solidus URLs, leading to unexpected behavior and discrepancies across various scenarios. This bug report aims to elucidate the issue and propose a standardized approach for handling such URLs to enhance consistency and predictability.

Current Behavior:

The handling of special-scheme-missing-following-solidus URLs is outlined in the provided table:

| Base | Input | Result |
| --- | --- | --- |
| `http://example.com/` | `http:web.site` | `http://example.com/web.site` |
| `http://example.com/` | `https:web.site` | `https://web.site/` |
| `https://example.com/` | `http:web.site` | `http://web.site/` |
| `https://example.com/` | `https:web.site` | `https://example.com/web.site` |

The URLs are written as code to circumvent GitHub's automatic URL resolution.
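The rows above can be reproduced with the `URL` constructor in any runtime that implements the URL Standard (a browser console or Node.js, for example). This is only a minimal sketch: `resolve` and `rows` are hypothetical names chosen for illustration, not part of any API.

```typescript
// Resolve `input` against `base` using the runtime's WHATWG URL parser.
// `resolve` is a hypothetical helper name used only in this sketch.
function resolve(input: string, base: string): string {
  return new URL(input, base).href;
}

// The four base/input pairs from the table above.
const rows: Array<[string, string]> = [
  ["http://example.com/", "http:web.site"],
  ["http://example.com/", "https:web.site"],
  ["https://example.com/", "http:web.site"],
  ["https://example.com/", "https:web.site"],
];

for (const [base, input] of rows) {
  // Prints one line per row: base, input, and the serialized result.
  console.log(`${base}  ${input}  ->  ${resolve(input, base)}`);
}
```

When the input's scheme matches the base's scheme, the result keeps the base host; otherwise `web.site` becomes the host.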

Issue:

The existing handling of special-scheme-missing-following-solidus URLs results in inconsistencies, particularly when the URL scheme differs from the base URL scheme. For example, in the second row of the table, the resulting URL is https://web.site/, which may not align with user expectations.
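One way to picture where the split comes from is the simplified model below. It is not the parser algorithm from the standard, only a sketch that assumes inputs shaped like `scheme:rest`, where the scheme is a special scheme such as `http` or `https` and `rest` does not start with a slash. The name `modelResolve` is hypothetical.

```typescript
// Simplified model of the observed behavior for inputs like "http:web.site".
// Assumption: `input` is "scheme:rest" with a special scheme, and `rest`
// does not begin with "/" or "\". This is NOT the full parser algorithm.
function modelResolve(input: string, base: string): string {
  const baseUrl = new URL(base);
  const colon = input.indexOf(":");
  const scheme = input.slice(0, colon).toLowerCase();
  const rest = input.slice(colon + 1);

  if (`${scheme}:` === baseUrl.protocol) {
    // Same scheme as the base: the remainder behaves like a relative
    // reference, so the base's host is kept.
    return new URL(rest, baseUrl.href).href;
  }
  // Different scheme: the missing slashes are tolerated and the remainder
  // is read as the authority (host) of a new absolute URL.
  return new URL(`${scheme}://${rest}`).href;
}

console.log(modelResolve("http:web.site", "http://example.com/"));
console.log(modelResolve("https:web.site", "http://example.com/"));
```

Under this model, the only thing that decides between the two outcomes is whether the input's scheme equals the base's scheme.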

Suggested Improvement:

To standardize and enhance predictability, I propose adopting one of the following approaches:

  1. Consistent Domain Resolution:

    Ensure that the resolved URL consistently reflects the domain specified in the input, irrespective of the base URL scheme. This approach fosters consistency across diverse scenarios and mitigates unexpected outcomes.

    Revised Table:

    | Base | Input | Result |
    | --- | --- | --- |
    | `http://example.com/` | `http:web.site` | `http://web.site` |
    | `http://example.com/` | `https:web.site` | `http://web.site/` |
    | `https://example.com/` | `http:web.site` | `http://web.site/` |
    | `https://example.com/` | `https:web.site` | `https://web.site` |

  2. Preserve Base Domain:

    Ensure that the resolved URL incorporates the base domain when the input URL does not explicitly specify it. This approach maintains clarity regarding the base domain and yields predictable outcomes when the domain is omitted in the input URL.

    Revised Table:

    | Base | Input | Result |
    | --- | --- | --- |
    | `http://example.com/` | `http:web.site` | `http://example.com/web.site` |
    | `http://example.com/` | `https:web.site` | `http://example.com/web.site/` |
    | `https://example.com/` | `http:web.site` | `https://example.com/web.site/` |
    | `https://example.com/` | `https:web.site` | `https://example.com/web.site` |

Conclusion:

Standardizing the handling of special-scheme-missing-following-solidus URLs will bolster consistency, predictability, and interoperability across various implementations of the WhatWG URL standard. The proposed improvements aim to rectify the existing inconsistency and ensure a more coherent behavior in URL resolution.

annevk commented 7 months ago

This is a very well-written issue report. However, I'm afraid the request runs counter to one of our goals (https://url.spec.whatwg.org/#goals), in particular getting alignment across implementations. This change would likely not be web-compatible and would lead to broken experiences for end users.

(Apologies if browsers match what you propose and you just forgot to write that down. I'm pretty sure they don't, but I didn't double check.)

RedYetiDev commented 7 months ago

> Align RFC 3986 and RFC 3987 with contemporary implementations and obsolete the RFCs in the process. (E.g., spaces, other "illegal" code points, query encoding, equality, canonicalization, are all concepts not entirely shared, or defined.) URL parsing needs to become as solid as HTML parsing. [RFC3986] [RFC3987]

I have tried to report this issue to some browsers, and here is what I got:

Mozilla

From https://bugzilla.mozilla.org/show_bug.cgi?id=1879227

> Me: Okay, but I still think it's an issue, maybe with WhatWG. I'll ask them and find out. Thank you
>
> Mozilla: I agree. The URL is parsed according to the URL standard
>
> Mozilla: This specific case would be a validation error, but either way the parsing is correct and all browsers agree on this behaviour.

From that snippet, it is unclear to me whether they agree or disagree: they said that "the parsing is correct and all browsers agree on this behaviour", but they also said "I agree" when I was referring to this behavior as an issue.

Chromium

I tried to report it through Chromium's issue tracker, but I can't see the current status of my bug report.

annevk commented 7 months ago

I think Valentin agreed with Tom there, not with you.

RedYetiDev commented 7 months ago

You're probably right. What does this mean for this bug report?

annevk commented 7 months ago

I'll close it as this is not something we want to change. Thanks for taking the time to report it!