wjdp / htmltest

✅ Test generated HTML for problems
MIT License

IgnoreURLs fails if URL given is not the domain #109

Closed andymule closed 2 years ago

andymule commented 5 years ago

IgnoreURLs only seems to ignore matching domains. Given the name, I'd expect it to also ignore both www.anydomain.com/folder1/none.html and file:///C:///stupidfolder/folder3/none.html

with a setting like:

    IgnoreURLs:
    - "none.html"

wjdp commented 5 years ago

Probably another issue with the file scheme. I don't think this option supports specifying the scheme (unsure).

christianrondeau commented 5 years ago

Just had a quick check myself since I have the same issue. My case is a link like /download/file.zip with the ignore string download, and the link is still validated.

I won't be able to make a PR but looking at the code quickly, it seems IgnoreURLs is only applied for external links (specifically http and https): https://github.com/wjdp/htmltest/blob/586df577838e2bec059948190714fbabf9c2981f/htmltest/check-link.go#L78

The cheap way of solving this would be to simply check the regex against other schemes as well. Something like copying this into checkInternal and checkInternalHash:

    if hT.opts.isURLIgnored(urlStr) {
        return
    }

Then add a test against the path here: https://github.com/wjdp/htmltest/blob/586df577838e2bec059948190714fbabf9c2981f/htmltest/check-link_test.go#L30
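To illustrate the idea, here is a minimal standalone sketch of matching ignore patterns against any reference string, whatever its scheme. It is not htmltest's actual code: the isURLIgnored function below is a hypothetical stand-in that takes the pattern list as a parameter, and it assumes each IgnoreURLs entry is treated as a regular expression tested against the raw link text.

```go
package main

import (
	"fmt"
	"regexp"
)

// isURLIgnored is a hypothetical stand-in for the option check discussed
// above. It reports whether any pattern, compiled as a regular expression,
// matches urlStr. Patterns that fail to compile are skipped.
func isURLIgnored(patterns []string, urlStr string) bool {
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			continue // assumption: an invalid pattern never matches
		}
		if re.MatchString(urlStr) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative patterns based on the examples in this thread.
	patterns := []string{"download", `none\.html`}
	refs := []string{
		"/download/file.zip",
		"file:///C:///stupidfolder/folder3/none.html",
		"https://www.anydomain.com/folder1/none.html",
		"/docs/index.html",
	}
	for _, ref := range refs {
		fmt.Printf("%-50s ignored=%v\n", ref, isURLIgnored(patterns, ref))
	}
}
```

Because the match is an unanchored regex search over the whole string, the same pattern covers internal paths, file URLs, and http(s) URLs alike. Whether that is the desired behaviour for internal links is a design decision for the maintainers.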

Hope that helps someone who has time for a PR :) Or me eventually, who knows!

lucperkins commented 4 years ago

My experience also confirms that IgnoreURLs does not work with internal links.

catherineluse commented 4 years ago

In my case I'm updating documentation separately from the rest of the company website, so I would like to ignore links to example.com/products, while still validating links to example.com/docs.
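A path-scoped pattern for this use case might look like the sketch below. The regular expression and the shouldCheck helper are illustrative only and not part of htmltest. The sketch assumes the ignore pattern is tested against the full URL string, which according to the comments above currently only happens for http and https links.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignoreProducts is an illustrative pattern: it matches links under
// example.com/products but not example.com/docs.
var ignoreProducts = regexp.MustCompile(`^https?://example\.com/products(/|$)`)

// shouldCheck is a hypothetical helper that reports whether a link would
// still be validated when ignoreProducts is the only ignore pattern.
func shouldCheck(link string) bool {
	return !ignoreProducts.MatchString(link)
}

func main() {
	links := []string{
		"https://example.com/products/widget",
		"https://example.com/docs/intro",
	}
	for _, link := range links {
		fmt.Printf("%s check=%v\n", link, shouldCheck(link))
	}
}
```

Anchoring the pattern with `^` and ending it with `(/|$)` keeps it from accidentally matching look-alike paths such as /products-old.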