transitive-bullshit / check-links

Robustly checks an array of URLs for liveness. Extremely fast ⚡
MIT License

Investigate false negatives #4

Open transitive-bullshit opened 5 years ago

transitive-bullshit commented 5 years ago

https://github.com/sindresorhus/awesome-lint/issues/50#issuecomment-463132897

transitive-bullshit commented 5 years ago

My guess is these links aren't responding to HEAD and are taking too long to respond to GET requests, but further investigation is needed.
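To make the hypothesis concrete, here is a minimal sketch of a HEAD-first check that falls back to GET. It is not the actual check-links implementation: `isAlive`, `Requester`, and the injected `request` function are hypothetical names, and treating any 2xx or 3xx status as alive is an assumption.

```typescript
// Hypothetical names for illustration; not the check-links API.
type Method = "HEAD" | "GET";

// Resolves with an HTTP status code, or rejects on a network error or timeout.
type Requester = (url: string, method: Method, timeoutMs: number) => Promise<number>;

async function isAlive(
  url: string,
  request: Requester,
  timeoutMs: number = 10000
): Promise<boolean> {
  // Try the cheap HEAD request first, then fall back to a full GET.
  const methods: Method[] = ["HEAD", "GET"];
  for (const method of methods) {
    try {
      const status = await request(url, method, timeoutMs);
      // Assumption: any 2xx or 3xx status counts as alive.
      if (status >= 200 && status < 400) {
        return true;
      }
    } catch {
      // Timeout or network failure: fall through to the next method.
    }
  }
  // Neither HEAD nor GET produced a successful status.
  return false;
}
```

With a check shaped like this, a server that ignores HEAD and then answers GET slowly would be reported as dead once both attempts exceed `timeoutMs`, which is the failure mode suspected here.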

HorlogeSkynet commented 5 years ago

--> Yes, it happened with https://thinkerview.com/, a WordPress site that was very slow to respond in the past (and thus failed the CI with awesome-lint). Unfortunately for you, I don't have any "non-working" example at the moment 🙄

transitive-bullshit commented 5 years ago

I'm considering increasing the default 10000ms timeout to something larger like 30000ms since I'm really not sure what else can be done in terms of retries or robustness logic.

There will of course always be cases that look like false negatives no matter what (like a server not responding when run from CI but then being alive when the user tries the link manually), but I feel like it shouldn't hurt much to increase the default timeout.

My thinking is that most truly invalid URLs will not time out but will instead quickly return an HTTP status code that got will not retry on, so we'll fail fast, whereas sporadically slow servers sometimes need more than 10s to respond.
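The reasoning above can be sketched as a small classification step. This is only an illustration under assumptions: `Outcome`, `Verdict`, and `classify` are hypothetical names that belong to neither check-links nor got, and the sketch deliberately ignores any status codes a client might choose to retry.

```typescript
// Hypothetical names for illustration; not part of check-links or got.
type Outcome =
  | { kind: "status"; code: number } // the server answered with a status code
  | { kind: "timeout" };             // no answer within the timeout

type Verdict = "alive" | "dead" | "inconclusive";

function classify(outcome: Outcome): Verdict {
  if (outcome.kind === "timeout") {
    // The server may simply be slow, so a timeout alone proves nothing.
    return "inconclusive";
  }
  // A server that answers promptly gives a definitive result: fail fast.
  return outcome.code >= 200 && outcome.code < 400 ? "alive" : "dead";
}
```

Under this view, a longer timeout only affects the "inconclusive" branch: links that answer with an error status are still rejected as soon as the response arrives.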

@davidtheclark @sindresorhus thoughts on this change?

transitive-bullshit commented 5 years ago

Okay; I've updated the default timeout to 30000 ms and released v1.1.6.

If anyone else has any ideas on how to improve robustness for mitigating false negatives, please follow up on this thread.

mischah commented 5 years ago

Hej @transitive-bullshit,

I still have a permanent false negative with this link running awesome-lint on Travis: https://vimeo.com/181328943

See Travis build log.

According to my package-lock.json check-links 1.1.7 is used.

Locally it’s working fine.

transitive-bullshit commented 5 years ago

Thanks @mischah, I'll investigate what's going on when I have some time.

It's entirely possible that vimeo and other media hosting sites blacklist certain cloud provider IPs like AWS where CI is inevitably being run in order to mitigate piracy, but that's just a hypothesis. It would certainly explain why the vimeo link works locally but isn't accessible via CI.

sindresorhus commented 5 years ago

Some more: https://github.com/sindresorhus/awesome/pull/1532#issuecomment-467538604

sindresorhus commented 5 years ago

Another one: https://github.com/sindresorhus/awesome/pull/1330#issuecomment-467358884

stevesong commented 5 years ago

I have a false negative when running awesome-lint for https://www.facebook.com/ads/audience-insights/people

sindresorhus commented 5 years ago

More: https://github.com/sindresorhus/awesome/pull/1370#issuecomment-467620142

NewAlexandria commented 5 years ago

Another one: https://travis-ci.com/NewAlexandria/awesome-livecoding/builds/105855505

HorlogeSkynet commented 5 years ago

Another one : https://creativecommons.org/publicdomain/zero/1.0/

(From CI : https://travis-ci.org/HorlogeSkynet/awesome-thinkerview/builds/526068957 + Deprecation warning ?)

++ :wave:

EDIT : ... that is not anymore :thinking:

irazasyed commented 5 years ago

And here's one more: https://github.com/sindresorhus/awesome-lint/issues/88

jessevdp commented 5 years ago

Another false-positive. This one first shows a 500 error, then 1 sec later shows the actual content.

https://blog.ginetta.net/getting-started-with-gatsby-and-cockpit-part-1-of-2-d86871932d44
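A link that returns a 500 first and real content a moment later suggests retrying transient server errors before declaring the link dead. The following is a minimal sketch under assumptions: `aliveWithRetry`, `Probe`, and `sleep` are hypothetical helpers, and the attempt count, the delay, and the choice to treat only 5xx responses as transient are illustrative defaults rather than anything check-links or remark-lint-no-dead-urls is known to do.

```typescript
// Hypothetical helpers for illustration; not an existing library API.
type Probe = () => Promise<number>; // resolves with an HTTP status code

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function aliveWithRetry(
  probe: Probe,
  maxAttempts: number = 3,
  delayMs: number = 1000
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await probe();
    if (status >= 200 && status < 400) {
      return true;
    }
    // Assumption: only 5xx responses are transient; anything else fails now.
    if (status < 500 || attempt === maxAttempts) {
      return false;
    }
    // Give the server a moment to recover before the next attempt.
    await sleep(delayMs);
  }
  return false;
}
```

The trade-off is that every genuinely broken 5xx link now costs up to `maxAttempts` requests plus the delays in between, so the defaults would need tuning for large link lists.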

innocenzi commented 4 years ago

More false-positives here:

  ×  114:3  Link to https://hellosun.brussels is dead                               remark-lint:no-dead-urls
  ×  236:5  Link to https://codepen.io/adamwathan/pen/RxWrZr is dead                remark-lint:no-dead-urls
  ×  238:5  Link to https://codepen.io/drehimself/full/vpeVMx is dead               remark-lint:no-dead-urls
  ×  244:3  Link to https://codepen.io/joshmanders/pen/PQQBoR is dead               remark-lint:no-dead-urls

iAdramelk commented 3 years ago

One more example:

  278:37-278:94  warning  Link to https://linuxize.com/post/linux-chown-command/ is dead  no-dead-urls  remark-lint

undergroundwires commented 3 years ago

ko-fi links do not work anymore; they worked fine until a month ago 🤔

https://ko-fi.com/undergroundwires

davidtheclark/remark-lint-no-dead-urls#29