w3c / link-checker

Check links and anchors in Web pages or full Web sites.
https://validator.w3.org/checklink

`#top`/`#` anchors incorrectly flagged as broken/missing anchor errors #55

Closed. gwern closed this issue 3 years ago

gwern commented 3 years ago

If I check a page on my website, such as the index, which has a link to `#top` at the bottom of the page for the convenience of readers (alongside the standard 'return-to-top' floating widget), the link checker reports an error:

Lines: 2552, 2567, 2581, 3164 https://www.gwern.net/index Status: 200 OK

Some of the links to this resource point to broken URI fragments (such as index.html#fragment). Broken fragments:

   https://www.gwern.net/index#top (line 3164)

The link works fine. As I understand it, `#top` and `#` are guaranteed to be valid anchor references, resolved at runtime by any standards-compliant browser, according to MDN:

You can use href="#top" or the empty fragment (href="#") to link to the top of the current page, as defined in the HTML specification.

The linked specification does say:

Let fragment be the document's URL's fragment. If fragment is the empty string, then the indicated part of the document is the top of the document; return.

So `#` is definitely required by the standard to exist and be valid. I'm unsure where `top` is defined, but MDN and everyone else seem to think it's defined in exactly the same way.

So both `#top` and `#` will always be valid anchor links, and any error the link checker reports for them is a false positive. They should be whitelisted and not appear in the output.
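To make the request concrete, here is a minimal sketch of the whitelisting I have in mind. It is written in Python purely for illustration and is not taken from the link-checker source; `ALWAYS_VALID_FRAGMENTS` and `is_fragment_valid` are hypothetical names, and the set of anchors found in the page is assumed to be collected elsewhere.

```python
from typing import Collection
from urllib.parse import urldefrag

# Fragments that point at the top of the document even when the page
# contains no matching anchor. Hypothetical constant, for illustration only.
ALWAYS_VALID_FRAGMENTS = frozenset({"", "top"})


def is_fragment_valid(url: str, anchors_in_page: Collection[str]) -> bool:
    """Return True if the fragment of `url` should not be reported as broken.

    `anchors_in_page` is assumed to hold the id/name anchors that the
    checker already extracted from the target document.
    """
    fragment = urldefrag(url).fragment
    # An anchor that really exists in the page is always fine.
    if fragment in anchors_in_page:
        return True
    # Otherwise accept only the whitelisted fragments (exact match here).
    return fragment in ALWAYS_VALID_FRAGMENTS
```

With this check, `https://www.gwern.net/index#top` and `https://www.gwern.net/index#` would pass even when the page defines no such anchor, while a genuinely missing fragment would still be reported.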

dontcallmedom commented 3 years ago

Thanks for the clear bug report. This should now be fixed.