nextstrain / nextclade

Viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement
https://clades.nextstrain.org
MIT License

DOC: Some links still point to old (no longer existing) Github URLs #963

Closed corneliusroemer closed 2 years ago

corneliusroemer commented 2 years ago

Matthijs Welkers kindly reported that our docs still contain links to GitHub release assets that use the v1 binary names.

It would be great if we could sift through the docs and replace the outdated links.

Example: [image]

j23414 commented 2 years ago

In case it helps, there may be a way to programmatically find the broken links:

It's been a couple years since I looked at automagically link-checking Jekyll pages, and I think we've since moved to Sphinx.

ivan-aksamentov commented 2 years ago

Fixed in https://github.com/nextstrain/nextclade/pull/964

ivan-aksamentov commented 2 years ago

@j23414 Can you please advise on how to integrate the linkcheck into the current build system? https://github.com/nextstrain/nextclade/blob/18aa4e308c4dc566b71c85097f4ee0cada28f8b2/docs/Makefile#L6

j23414 commented 2 years ago

Sure, I'm willing to explore! I'll try building docs locally using that makefile. Hopefully it's only SPHINXOPT='-b linkcheck' and somehow capture the list of broken links.

Hmm, I wouldn't set it up as a GitHub Action, as I wouldn't want the build to "fail" just because of a broken link. It should just flag the broken link so someone can fix it later.

ivan-aksamentov commented 2 years ago

> Hopefully it's only SPHINXOPT='-b linkcheck' and somehow capture the list of broken links.

I tried, but nothing happened. I just don't fully understand how it works.

> wouldn't want the build to "fail" just because of a broken link.

Definitely no need to fail the build. A warning in the terminal would be alright.
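
One way to get a terminal warning without failing the build is to wrap the check so that its exit status is swallowed. This is a minimal sketch, assuming a Bash-compatible shell; `warn_only` is a hypothetical helper name, and the commented-out `sphinx-build` call assumes Sphinx is installed and that `conf.py` sits in the current directory (the output directory name is arbitrary).

```shell
# Run any command, but never propagate its failure to the caller.
# Usage: warn_only <command> [args...]
warn_only() {
  if ! "$@"; then
    # Report on stderr so the message stands out in build logs.
    echo "WARNING: '$*' reported problems (build not failed)" >&2
  fi
  return 0
}

# Hypothetical usage, not taken from the nextclade build system:
# warn_only sphinx-build -b linkcheck . build/linkcheck
```

Because the wrapper always returns 0, a Makefile target or CI step that calls it would carry on even when broken links are found.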

j23414 commented 2 years ago

Ah, here's the local build:

git clone https://github.com/nextstrain/nextclade.git
cd nextclade/docs
conda env create
conda activate docs.clades.nextstrain.org

# Add the link check
make -b linkcheck html &> msgs.txt

# Pull broken links, ignore any changelog messages since they should be out of date
cat msgs.txt \
  | grep "broken" \
  | grep -v "CHANGELOG" \
  | sort \
  | less

which gave me:

(   user/datasets: line  158) broken    https://github.com/nextstrain/nextclade_data_workflows - 404 Client Error: Not Found for url: https://github.com/nextstrain/nextclade_data_workflows
(user/algorithm/01-sequence-alignment: line    5) broken    nextclade-cli - 
(user/algorithm/02-translation: line    5) broken    ../terminology.html#gene-map - 
(user/algorithm/02-translation: line    5) broken    nextalign-cli - 
(user/algorithm/02-translation: line    5) broken    nextclade-web - 
(user/algorithm/02-translation: line   13) broken    ../terminology.html#peptide - 
(user/algorithm/06-clade-assignment: line    3) broken    ../terminology.html#clade - 
(user/algorithm/06-clade-assignment: line    5) broken    05-phylogenetic-placement.html#known-limitations - 
(user/algorithm/nextclade-pango: line   12) broken    https://academic.oup.com/ve/article/7/2/veab064/6315289 - 403 Client Error: Forbidden for url: https://academic.oup.com/ve/article/7/2/veab064/6315289
(user/algorithm/nextclade-pango: line   40) broken    https://academic.oup.com/mbe/article/37/5/1530/5721363 - 403 Client Error: Forbidden for url: https://academic.oup.com/mbe/article/37/5/1530/5721363
(user/algorithm/nextclade-pango: line   46) broken    https://academic.oup.com/ve/article/4/1/vex042/4794731 - 403 Client Error: Forbidden for url: https://academic.oup.com/ve/article/4/1/vex042/4794731
(user/input-files: line    7) broken    terminology.html#query-sequence - 
(user/nextclade-cli: line  157) broken    (https://anaconda.org/bioconda/nextclade) - 
(user/output-files: line    7) broken    ../_images/web_download-options.png - 
(user/output-files: line  368) broken    terminology.html#reference-tree-concept - 
(user/terminology: line  125) broken    algorithm#alignment - 
(user/terminology: line  145) broken    algorithm#phylogenetic-placement - 
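
With a list like this, a per-page count can help decide where to start. The following is a rough sketch, assuming the `(page: line N) broken target - reason` line format shown above; `broken_per_doc` is a hypothetical helper name, not part of Sphinx.

```shell
# Count "broken" entries per documentation page, most affected page first.
# Reads linkcheck output lines on stdin.
broken_per_doc() {
  awk -F'[(:]' '
    /broken/ {
      page = $2                      # text between "(" and the first ":"
      gsub(/^ +| +$/, "", page)      # trim the padding linkcheck adds
      count[page]++
    }
    END { for (p in count) printf "%d\t%s\n", count[p], p }
  ' | sort -k1,1nr -k2,2
}

# Hypothetical usage with the log captured earlier:
# grep -v "CHANGELOG" msgs.txt | broken_per_doc
```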

Summarized fixes here: 👈

The rest of the flagged links worked fine; those may be false positives caused by linkcheck failing to follow an anchor link or by some kind of user-agent problem.
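
To separate likely false positives from links that probably need editing, the log can be split by error type. This is only a sketch: it assumes that a `403 Client Error` usually means the site rejected an automated client rather than that the page is gone, which is a guess and not something linkcheck reports. `classify_broken` and the `review`/`fix` labels are hypothetical names.

```shell
# Tag each "broken" line on stdin:
#   review = HTTP 403, possibly a bot or user-agent block; check by hand
#   fix    = everything else flagged as broken
classify_broken() {
  awk '
    /broken/ && /403 Client Error/ { print "review\t" $0; next }
    /broken/                        { print "fix\t" $0 }
  '
}

# Hypothetical usage with the log captured earlier:
# grep -v "CHANGELOG" msgs.txt | classify_broken | sort
```

Lines tagged `fix` still include relative anchors that may resolve correctly in the built site, so they need a manual look as well.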