iterative / link-check

A Node-based tool to verify if links are alive. Built to be used anywhere!

broken links not caught when page was removed #12

Closed jorgeorpinel closed 2 years ago

jorgeorpinel commented 2 years ago

https://github.com/iterative/dvc.org/pull/3275/checks passed even when that PR removed a page that was at the time linked from multiple places (links fixed in https://github.com/iterative/dvc.org/pull/3327).

Was this a link checker error?

julieg18 commented 2 years ago

No, I don't think so. I'm pretty sure the link-check action that runs on PRs only checks the links inside the edited files. Since it didn't run for the entire site, it didn't detect the broken links. But, @rogermparent, am I wrong?

rogermparent commented 2 years ago

@julieg18 is exactly right here: it specifically reads the git diff on PRs, just like the original script it replaced.

However, this opens up an interesting question: would it be better just to go through the whole site and target that use case? It would catch situations like this and mean we maintain a lot less code. It would also make it easier to use something like lychee and spare us from maintaining a link-check package altogether.
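
To make the trade-off above concrete, here is a minimal sketch of the two modes. It is not the link-check package's actual code: the file names, page paths, and the `findBrokenLinks` and `internalLinks` helpers are all hypothetical, and the "site" is just an in-memory object. It only illustrates why a diff-scoped run can pass while an untouched file still links to a page the PR removed.

```typescript
// Hypothetical model: file name -> file contents (not the real link-check data structures).
type SiteContents = Record<string, string>;

interface BrokenLink {
  file: string;
  target: string;
}

// Collect internal link targets (paths starting with "/") from markdown-style links.
function internalLinks(content: string): string[] {
  const pattern = /\]\((\/[^)\s#]*)/g;
  const links: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(content)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Report every internal link in `filesToScan` whose target is not an existing page.
function findBrokenLinks(
  filesToScan: string[],
  contents: SiteContents,
  existingPages: string[],
): BrokenLink[] {
  const broken: BrokenLink[] = [];
  for (const file of filesToScan) {
    for (const target of internalLinks(contents[file] ?? "")) {
      if (existingPages.indexOf(target) === -1) {
        broken.push({ file, target });
      }
    }
  }
  return broken;
}

// Assumed scenario: the PR deletes /doc/old-page and edits only sidebar.md.
const siteContents: SiteContents = {
  "sidebar.md": "[Home](/) and [Guide](/doc/guide)",
  "guide.md": "See the [old page](/doc/old-page) for details.", // untouched by the PR
};
const pagesAfterPr = ["/", "/doc/guide"]; // /doc/old-page no longer exists
const changedFiles = ["sidebar.md"]; // what a git diff would list

// Diff-scoped mode: only the edited files are scanned.
const diffScoped = findBrokenLinks(changedFiles, siteContents, pagesAfterPr);
// Whole-site mode: every file is scanned.
const wholeSite = findBrokenLinks(Object.keys(siteContents), siteContents, pagesAfterPr);

console.log("diff-scoped:", diffScoped);
console.log("whole-site:", wholeSite);
```

In this sketch the diff-scoped run reports nothing, because `guide.md` was never scanned, while the whole-site run flags the link from `guide.md` to `/doc/old-page`.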

julieg18 commented 2 years ago

However, this opens up an interesting question: would it be better just to go through the whole site and target that use case?

Currently, we have around 40 broken links throughout the site. If we can get those cleared up and get a package that has a low chance of false negatives, I don't see a problem with having a check that covers the entire site on PRs. But if we did it now, the check would probably be too noisy for every PR...

casperdcl commented 2 years ago

checks the entire site on prs

jorgeorpinel commented 2 years ago

Yeah no need to change the behavior. But there is a site-wide link check process somewhere right? I keep forgetting where to see that. Would be ideal getting notified as well.

julieg18 commented 2 years ago

Yeah no need to change the behavior. But there is a site-wide link check process somewhere right? I keep forgetting where to see that. Would be ideal getting notified as well.

Yes, it's run once a day as a GitHub Action called "Check all links in the repository". We'll probably want to wait on notifications until we improve the link-check, though. Currently, it's very noisy with a lot of false negatives.

jorgeorpinel commented 2 years ago

OK, so I guess this issue is unnecessary. We can work on https://github.com/iterative/dvc.org/issues/2486 instead for now. Thanks.