JuliaLang / www.julialang.org

Julia Project website
https://julialang.org

Chasing dead links #690

Closed tlienart closed 2 years ago

tlienart commented 4 years ago

Right, so I ran a script to check all pages. It's not perfect, but it should help us find quite a few of the dead links; list below. Maybe people could take responsibility for fixing things and ping me in the PR so that I can mark it as done here.

There's a bunch of dead meetups etc that should probably just be removed

@ Helpers: there's two things you can do:

Edit:

Script

(skip this, it's only for maintenance purposes; I might eventually put this in the README)

It uses blc and is not pretty but it kind of does an ok job:

# path to the blc executable; modify this to wherever your npm installs stuff
const BLC = "/usr/local/Cellar/node/11.10.1/lib/node_modules/broken-link-checker/bin/blc"
@assert success(`$BLC -V`)
# run blc on one page and print every broken link as an unchecked task-list item
function check_page(url)
    # blc writes its report to stdout, so redirect stdout to a temporary file
    open("tempf", "w") do outf
        redirect_stdout(outf) do
            try run(`$BLC $url`); catch; end # sometimes BLC does weird stuff
        end
    end
    output = readlines("tempf")
    rm("tempf")
    for line in output
        # keep only the lines that blc flags as broken
        startswith(line, "├─BROKEN─") || continue
        tmp = replace(line, "├─BROKEN─ " => "")
        println("  * [ ] $tmp")
    end
end
# modify this to your local version of the site, assumes you've built it.
const BASE_PATH = "/Users/tlienart/Desktop/www.julialang.org/__site"
# every index.html under the built site corresponds to one live page to check
for (root, _, files) in walkdir(BASE_PATH)
    for file in files
        file == "index.html" || continue
        # turn the local file path into the corresponding live URL
        fp = replace(joinpath(root, file), BASE_PATH => "")
        fp = "https://julialang.org" * fp
        println("* [ ] $fp")
        check_page(fp)
    end
end
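
A possible variant of `check_page`, as a minimal sketch: it reads the blc output directly instead of going through the `tempf` file, and it separates the parsing from the printing so the parsing can be tried on a plain string. The names `broken_items` and `check_page_captured` are hypothetical and not part of the script above; the only thing assumed about blc is the `├─BROKEN─ ` prefix the script already relies on.

```julia
# Hypothetical helpers, not part of the script above.
const BROKEN_PREFIX = "├─BROKEN─ "

# Turn raw blc output into task-list items, keeping only lines flagged as broken.
function broken_items(output::AbstractString)
    items = String[]
    for line in eachline(IOBuffer(output))
        startswith(line, BROKEN_PREFIX) || continue
        push!(items, "  * [ ] " * replace(line, BROKEN_PREFIX => ""; count=1))
    end
    return items
end

# Run blc on one page and print the broken links.
# `ignorestatus` keeps a non-zero exit code from throwing; unlike the
# try/catch in the original, other failures (e.g. a missing binary) still throw.
function check_page_captured(blc::AbstractString, url::AbstractString)
    output = read(ignorestatus(`$blc $url`), String)
    foreach(println, broken_items(output))
end
```

With `BLC` defined as in the script, `check_page_captured(BLC, fp)` could stand in for `check_page(fp)` inside the loop.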

List of dead links

Notes:

Errors

Chunk 1 (checked)

Chunk 2 (checked)

Chunk 4

Chunk 6

Chunk 8

Chunk 9

Chunk 10

Chunk 11

Chunk 14

Chunk 15

tlienart commented 4 years ago

I realise a bunch of these BLC_UNKNOWN errors are due to me replacing http: with https: en masse. E.g.:

http://www-math.mit.edu/~edelman/ vs https://www-math.mit.edu/~edelman/
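
One way to catch this kind of false positive before listing a link as dead, sketched under assumptions: produce both scheme variants of a URL so that each can be checked in turn. The helper `scheme_variants` is hypothetical; it only rewrites the string and does not make any request itself.

```julia
# Hypothetical helper: return the link under both schemes, original first,
# so that each variant can be checked before the link is declared dead.
function scheme_variants(url::AbstractString)
    if startswith(url, "https://")
        return [String(url), "http://" * url[9:end]]
    elseif startswith(url, "http://")
        return [String(url), "https://" * url[8:end]]
    end
    return [String(url)]  # not an http(s) link, nothing to swap
end
```

For the example above, `scheme_variants("https://www-math.mit.edu/~edelman/")` gives the `https:` link followed by the `http:` one.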

ViralBShah commented 4 years ago

For that particular one, the right link is https://math.mit.edu/~edelman/

tlienart commented 4 years ago

oh man this is hard work...

tlienart commented 4 years ago

alright, did about half, many of which were due to the HTTPS change. Other low-hanging fruit: all the GitHub ones and the error 429 ones...

ViralBShah commented 4 years ago

Thank you for this tireless effort!

ViralBShah commented 4 years ago

The github URLs all appear valid. What's going on?

ViralBShah commented 4 years ago

We should announce on #website and #general to look for help, but perhaps remove the 429 and 999 URLs from this list?
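
A minimal sketch of that pruning step, assuming each reported line ends with a parenthesised reason such as `(HTTP_429)` or `(BLC_UNKNOWN)`; that suffix format is an assumption about the blc output and may differ between versions. The names `IGNORED_REASONS`, `reason` and `prune` are hypothetical, and the URLs in the usage note are placeholders.

```julia
# Assumption: a reported line ends with "(REASON)", e.g. "(HTTP_429)".
# Reasons to drop from the list; 429 and 999 as suggested in the thread.
const IGNORED_REASONS = ("HTTP_429", "HTTP_999")

# Extract the trailing reason from a line, or `nothing` if there is none.
function reason(line::AbstractString)
    m = match(r"\(([A-Z0-9_]+)\)\s*$", line)
    return m === nothing ? nothing : String(m[1])
end

# Keep every line whose reason is not in the ignore list.
prune(lines) = filter(l -> !(reason(l) in IGNORED_REASONS), lines)
```

Calling `prune(["https://example.com/a (HTTP_429)", "https://example.com/b (HTTP_404)"])` would keep only the second entry; lines without a recognisable reason are kept as they are.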

tlienart commented 4 years ago

Yeah I think GitHub may have a strict robot, I’ll do another pass to do more pruning.

One thing that would be good is advice for dead pages, do we just remove the link or do we replace it with an indication that there was a link that’s now dead?

ViralBShah commented 4 years ago

Not sure if it is worth the effort to do more. I think we can just remove.

tlienart commented 4 years ago

no no, not done yet...

ViralBShah commented 4 years ago

Bump. Help appreciated here.

tlienart commented 4 years ago

I imagine a few of those are not relevant anymore, would be good to re-run the script (bottom of readme, I'm on my phone now so I can't do it but can try later)

ashwani-rathee commented 3 years ago

So all the errors after chunk 4 are left to be worked on, right?

tlienart commented 3 years ago

yes that's correct; this list might be a bit outdated now but basically the steps are:

  1. is the faulty link still there?
    1. yes --> is there an alternative link that works?
      1. yes --> change it in a PR (ideally do multiple links per PR, though maybe at most 20-30 to facilitate reviewing)
      2. no --> remove the link
    2. no --> reply here with a bunch of links that are irrelevant; I'll update the list

that's about it; then we should re-run the tool to see if there are any stray links in the mix.
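
Reading the steps as a two-level decision (is the link still on the site, and if so, is there a working alternative), the triage can be sketched as a small function. This is only an illustration: `triage` and the returned symbols are hypothetical names, and both inputs are answers a person supplies after looking at the page; nothing is fetched.

```julia
# Hypothetical encoding of the triage steps; inputs come from manual inspection.
# `alternative` is a working replacement URL, or `nothing` if none was found.
function triage(still_present::Bool, alternative::Union{Nothing,AbstractString})
    still_present || return :report_irrelevant      # reply in the issue so the list gets updated
    alternative === nothing && return :remove_link  # no working replacement exists
    return :replace_link                            # change it in a PR
end
```

For instance, `triage(true, nothing)` returns `:remove_link`, while `triage(false, nothing)` returns `:report_irrelevant`.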

HarshCasper commented 3 years ago

Hi @tlienart @ViralBShah

I would like to pick this issue back up. Since it has been open for a very long time now, and major chunks of dead links are still left to be reviewed and fixed, I would like to take it up and clean it up to give the project a better shape.

I have recently learned about Julia and would like to contribute to Julia open source in all ways possible. Kindly let me know if I can start working on it, or if there is any other obligation I need to fulfil first.

I am looking to get started with contributing to Julia with this 😄

ViralBShah commented 3 years ago

Thanks @HarshCasper. Just open PRs fixing the dead links. You can do so in batches. Don't mix any other changes into the PRs that are fixing the dead links - so that they can be quickly merged.

ViralBShah commented 2 years ago

I think we just have to leave the external links as they are. It's impossible to play catch up.