dask / dask-examples

Easy-to-run example notebooks for Dask
https://examples.dask.org/
Creative Commons Attribution Share Alike 4.0 International

Regularly check for broken links #152

Open hammer opened 4 years ago

hammer commented 4 years ago

Inspired by https://github.com/dask/dask-examples/pull/151, which I encountered manually, I just ran http://examples.dask.org through a dead link checker and found a handful of broken links. I don't have time to fix them all right now but I just thought I'd drop the results here.

It might be a good idea to include a dead link check as part of the website deploy, but that may also be overkill!
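A dead link check of this kind can be sketched with the Python standard library alone. This is a minimal sketch, not the checker used to produce the results below: the names `extract_links`, `check_url`, and `find_broken` are hypothetical, and it assumes that a failed request is reported as `-1` to mirror the status column in the results.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href/src targets from a page, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute http(s) links found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [u for u in collector.links if u.startswith(("http://", "https://"))]


def check_url(url, timeout=10):
    """Return the HTTP status code, or -1 if the request could not complete."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, ValueError, OSError):
        # DNS failure, timeout, refused connection, or malformed URL.
        return -1


def find_broken(html, base_url, check=check_url):
    """Return (status, url) pairs for links that failed or returned >= 400."""
    broken = []
    for url in extract_links(html, base_url):
        status = check(url)
        if status == -1 or status >= 400:
            broken.append((status, url))
    return broken
```

Passing `check` as a parameter keeps the network call replaceable, so the link extraction can be exercised offline. Some servers reject `HEAD` requests, so a real checker would probably fall back to `GET` before reporting a link as broken.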

| Status | URL | Source link text |
| --- | --- | --- |
| -1 Invalid URL | http://127.0.0.1:8787/status | http://127.0.0.1:8787/status |
| 404 Not Found | https://docs.dask.org/en/latest/bag-overview.html | Dask Bag Documentation |
| 404 Not Found | https://www.continuum.io/sites/default/files/dask_stacked.png | |
| -1 Invalid URL | http://10.20.0.141:8787/status | http://10.20.0.141:8787/status |
| 404 Not Found | https://ml.dask.org/examples/xgboost.html | http://ml.dask.org/examples/xgboost.html |
| 404 Not Found | https://xgboost.readthedocs.io/en/latest/python/python_intro | https://xgboost.readthedocs.io/en/latest/python/python_intro |
| 404 Not Found | https://distributed.readthedocs.io/en/latest/local-cluster.html | local cluster |
| 404 Not Found | https://docs.scipy.org/doc/numpy-1.16.0/reference/c-api | https://docs.scipy.org/doc/numpy-1.16.0/reference/c-api |
| 404 Not Found | https://scikit-learn.org/stable/modules/scaling_strategies.html | user guide [301 from http://scikit-learn.org/stable/modules/scaling_strategies.html] |
| 404 Not Found | https://numpy.org/doc/stable/reference/c-api.generalized-ufuncs.html | Generalized Universal Functions [302 from https://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html] |
| 404 Not Found | https://examples.dask.org/proxy/8787/status | dashboard's status page |
| 404 Not Found | https://examples.dask.org/proxy/8787/graph | dashboard's graph page |
| 404 Not Found | https://examples.dask.org/applications/' | Cleaning up temporary directories and files |
| 404 Not Found | https://examples.dask.org/applications/clip.gif | img/src |
| 404 Not Found | https://examples.dask.org/surveys/examples.dask.org | dask examples |
| -1 Timeout | http://www.celeryproject.org/ | Celery |
| 404 Not Found | https://distributed.readthedocs.io/en/latest/setup.html | scale out to a cluster |

mrocklin commented 4 years ago

cc @dask/maintenance

martindurant commented 4 years ago

Note that all the ones with 8787 are clearly not meant to exist; they would refer to a running scheduler. Adding a link checker isn't a bad idea, but I wouldn't require it to succeed for a PR to pass.
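One way a checker could skip links that only resolve against a running scheduler is sketched below. The helper `is_runtime_only` is hypothetical, and the rule it applies is an assumption drawn from the reported URLs: loopback or private addresses, plus paths under `/proxy/8787/`, are treated as dashboard links rather than documentation links.

```python
import ipaddress
from urllib.parse import urlsplit


def is_runtime_only(url, dashboard_port=8787):
    """Guess whether a URL points at a live dashboard rather than static docs."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # The host is a domain name (or empty), not a literal IP address.
        address = None
    if address is not None and (address.is_loopback or address.is_private):
        return True
    # Hosted notebooks reach the dashboard through a proxy path instead.
    return f"/proxy/{dashboard_port}/" in parts.path
```

Filtering these out before checking leaves only the links that can actually go stale. To keep the check from blocking a PR, the CI step could print the remaining failures and still exit successfully.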

mrocklin commented 4 years ago

Yeah, I'm more concerned with fixing the links pointing to old or stale documentation. I agree that many of these probably came from docs that referred to addresses generally.

jsignell commented 4 years ago

Yeah the built docs have the same issue. I agree fixing is more important than adding a CI check.

jsignell commented 4 years ago

Especially since it will always be external changes that would cause a failure.

mrocklin commented 4 years ago

Well, I think that CI would also be grand, if only to make us aware of failures as they arise due to external changes. I'll take what I can get though. Fixing existing links, or adding redirects upstream (see docs/conf.py in most repositories), should be an easy fix for most folks.

pratyakshajha commented 3 years ago

I would like to take this up if no one else has already. I can start by manually fixing the URLs listed in the issue.

jrbourbeau commented 3 years ago

That would be welcome, thanks @pratyakshajha!