b-cubed-eu / documentation

B-Cubed documentation website
https://docs.b-cubed.eu
Creative Commons Attribution 4.0 International

Upload D3.1 requirements #13

Closed by peterdesmet 5 months ago

peterdesmet commented 6 months ago

Chapters:

Conversion per chapter:

Editorial per chapter:

peterdesmet commented 6 months ago

Todo:

peterdesmet commented 6 months ago

The guide is online at https://docs.b-cubed.eu/dev-guide/. It needs some proofreading to test all links. @LauraAbr, is that something you can do?

PietrH commented 5 months ago

I've written a little script to mine all the external links of the dev guide; I've included them below.

With this list, I used httr2 to check the HTTP response code of each link. All but 2 resolved:

- https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects
- https://github.com/b-cubed-eu/documentation/tree/main/tutorials
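
For readers who want the status code itself rather than a pass/fail result, here is a minimal sketch of one way to record it with httr2. The helper names `classify_status()` and `get_status()` are hypothetical and not part of the script further down, and the category labels are only an illustration. The base pipe `|>` assumes R 4.1 or later.

```r
# Hypothetical helper (not in the original script): map an HTTP status code
# to a coarse category. NA means no HTTP response was received at all.
classify_status <- function(status) {
  if (is.na(status)) return("no response")
  if (status >= 200 && status < 300) return("ok")
  if (status >= 300 && status < 400) return("redirect")
  if (status >= 400 && status < 500) return("client error")
  if (status >= 500 && status < 600) return("server error")
  "other"
}

# Hypothetical helper: request a URL and return its status code, or NA when
# the request fails before any response arrives (DNS failure, timeout, ...).
get_status <- function(url) {
  tryCatch(
    {
      resp <- httr2::request(url) |>
        # Keep 4xx/5xx responses as responses instead of raising R errors,
        # so the code can be read with resp_status().
        httr2::req_error(is_error = function(resp) FALSE) |>
        httr2::req_perform()
      httr2::resp_status(resp)
    },
    error = function(e) NA_integer_
  )
}

# Usage (needs network access, so it is left commented out):
# status <- get_status("https://docs.b-cubed.eu/dev-guide/")
# classify_status(status)
```

Returning `NA` for failed requests keeps the result a plain integer vector when the helper is mapped over many links, which makes it straightforward to tabulate afterwards.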

Complete list of external links:

- https://github.com/just-the-docs/just-the-docs
- https://github.com/b-cubed-eu/documentation
- https://www.rfc-editor.org/rfc/rfc2119
- https://b-cubed.eu/storage/app/uploads/public/65e/1b2/2a0/65e1b22a0b85c121473896.pdf
- https://docs.b-cubed.eu/dev-guide/
- https://docs.github.com/en/get-started/quickstart/github-glossary
- https://github.com
- https://github.com/orgs/b-cubed-eu/repositories
- mailto:laura.abraham@plantentuinmeise.be
- https://docs.github.com/en/repositories/creating-and-managing-repositories/quickstart-for-repositories
- https://github.com/organizations/b-cubed-eu/repositories/new
- https://b-cubed.eu/storage/app/uploads/public/64e/f45/6cd/64ef456cd4da1356663578.pdf
- https://docs.github.com/en/migrations/importing-source-code/using-the-command-line-to-import-source-code/adding-locally-hosted-code-to-github#initializing-a-git-repository
- https://en.wikipedia.org/wiki/.DS_Store
- https://citation-file-format.github.io/#/what-is-a-citation-cff-file
- https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files
- https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics
- https://devguide.ropensci.org/grooming.html#github-repo-topics
- https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-personal-account-on-github/managing-access-to-your-personal-repositories/inviting-collaborators-to-a-personal-repository
- https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-readmes
- https://www.makeareadme.com/
- https://docs.python-guide.org/writing/documentation/
- https://devguide.ropensci.org/building.html#readme
- https://github.com/frictionlessdata/frictionless-r/#readme
- https://docs.ropensci.org/frictionless/
- https://docs.python-guide.org/writing/documentation/#restructuredtext-ref
- https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax
- https://docs.github.com/en/actions/monitoring-and-troubleshooting-workflows/adding-a-workflow-status-badge
- https://shields.io/badges
- https://dev.to/cicirello/badges-tldr-for-your-repositorys-readme-3oo3
- https://www.repostatus.org/
- https://lifecycle.r-lib.org/articles/stages.html
- https://data.research.cornell.edu/data-management/sharing/readme/
- https://datadryad.org/stash/best_practices#describe-your-dataset-in-a-readme-file
- https://www.contributor-covenant.org/
- https://docs.github.com/en/get-started/exploring-projects-on-github/finding-ways-to-contribute-to-open-source-on-github
- https://opensource.guide/
- https://opensource.guide/code-of-conduct/
- https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-code-of-conduct-to-your-project
- https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/setting-up-notifications/about-notifications#default-subscriptions
- https://docs.github.com/en/get-started/quickstart/github-glossary#mention
- https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/managing-subscriptions-for-activity-on-github/managing-your-subscriptions
- https://docs.github.com/en/get-started/using-github/github-flow
- https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/managing-a-branch-protection-rule
- https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/setting-guidelines-for-repository-contributors
- https://gist.github.com/peterdesmet/e90a1b0dc17af6c12daf6e8b2f044e7c
- https://tidyverse.tidyverse.org/CONTRIBUTING.html
- https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue
- https://dev.to/opensauced/how-to-write-a-good-issue-tips-for-effective-communication-in-open-source-5443
- https://code-review.tidyverse.org/issues/
- https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository
- https://desktop.github.com/
- https://docs.github.com/en/desktop/installing-and-authenticating-to-github-desktop/setting-up-github-desktop
- https://semver.org/
- https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository
- https://inbo.github.io/tutorials/tutorials/git_zenodo/
- https://devguide.ropensci.org/releasing.html#news
- https://cran.r-project.org/package=sp
- https://cran.r-project.org/package=rgdal
- https://cran.r-project.org/package=maptools
- https://rspatial.org/raster/pkg/1-introduction.html
- https://cran.r-project.org/package=rgeos
- https://r-spatial.github.io/sf/
- https://rspatial.github.io/terra/reference/terra-package.html
- https://devguide.ropensci.org/building.html#recommended-scaffolding
- https://style.tidyverse.org/
- https://covr.r-lib.org/
- https://testthat.r-lib.org/
- https://rstudio.github.io/shinytest/
- https://doi.org/10.1002/9781118448908
- https://devguide.ropensci.org/
- https://doi.org/10.5281/zenodo.6619350
- https://swcarpentry.github.io/r-novice-gapminder/
- http://doi.org/10.5281/zenodo.3265164
- https://data.agu.org/resources/introduction-to-open-science-agu
- https://www.britishecologicalsociety.org/wp-content/uploads/2019/06/BES-Guide-Reproducible-Code-2019.pdf
- https://doi.org/10.1371/journal.pcbi.1008770
- https://doi.org/10.1371/journal.pcbi.1003285
- https://r4ds.hadley.nz/
- https://bookdown.org/yihui/rmarkdown/
- https://data.agu.org/resources/r-guidance-agu-authors
- https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects
- https://www.britishecologicalsociety.org/wp-content/uploads/2018/12/BES-Reproducible-Code.pdf
- https://r-pkgs.org/workflow101.html#benefits-of-rstudio-projects
- https://swcarpentry.github.io/r-novice-gapminder/02-project-intro.html
- https://devguide.ropensci.org/building.html#pkgdependencies
- https://httr2.r-lib.org/
- https://jeroen.r-universe.dev/curl
- https://docs.ropensci.org/crul/
- https://arxiv.org/abs/1403.2805
- https://xml2.r-lib.org/
- https://cran.r-project.org/web/packages/rgdal/index.html
- https://cran.r-project.org/web/packages/rgeos/index.html
- https://cran.r-project.org/web/packages/maptools/index.html
- https://www.r-bloggers.com/2023/06/upcoming-changes-to-popular-r-packages-for-spatial-data-what-you-need-to-do/
- https://r-spatial.org/r/2023/04/10/evolution3.html
- https://r-spatial.org/r/2022/04/12/evolution.html
- https://readr.tidyverse.org/
- https://recology.info/2018/10/limiting-dependencies/
- https://r-pkgs.org/dependencies-mindset-background.html#sec-dependencies-pros-cons
- https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/
- https://www.tidyverse.org/blog/2022/09/playing-on-the-same-team-as-your-dependecy/
- https://usethis.r-lib.org/
- https://github.com/r-lib/styler
- https://lintr.r-lib.org/
- https://devguide.ropensci.org/building.html#code-style
- https://r4ds.hadley.nz/workflow-style
- https://r-pkgs.org/testing-basics.html
- https://devguide.ropensci.org/building.html#testing
- https://mtlynch.io/good-developers-bad-tests/
- https://books.ropensci.org/http-testing/
- https://www.manning.com/books/unit-testing
- https://testthat.r-lib.org/articles/snapshotting.html
- https://vdiffr.r-lib.org/
- https://ggplot2.tidyverse.org/
- https://ggplot2.tidyverse.org/reference/ggplot_build.html?q=layer_data#details
- https://www.tidyverse.org/blog/2022/09/playing-on-the-same-team-as-your-dependecy/#testing-testing
- https://docs.ropensci.org/pkgcheck/
- https://inbo.github.io/checklist/
- https://en.wikipedia.org/wiki/Static_program_analysis
- http://mangothecat.github.io/goodpractice/
- https://russhyde.github.io/dupree/
- https://roxygen2.r-lib.org/
- https://www.njtierney.com/post/2023/11/10/how-to-get-good-with-r
- https://r4ds.had.co.nz/functions.html#functions
- https://style.tidyverse.org/functions.html
- https://r-pkgs.org/code.html#sec-code-organising
- https://rstudio.github.io/rstudio-extensions/rstudio_snippets.html
- https://design.tidyverse.org/inputs-explicit.html
- https://r4ds.had.co.nz/functions.html
- https://www.njtierney.com/post/2023/11/10/how-to-get-good-with-r/#write-functions
- http://design.tidyverse.org
- https://www.stat.berkeley.edu/~statcur/Workshop2/Presentations/functions.pdf
- http://swcarpentry.github.io/swc-releases/2017.08/r-novice-inflammation/02-func-R/
- https://en.wikipedia.org/wiki/Code_refactoring
- https://en.wikipedia.org/wiki/Don%27t_repeat_yourself
- https://startup-cto.medium.com/moist-code-why-code-should-not-be-completely-dry-1f06f2d31c31
- https://kentcdodds.com/blog/aha-programming
- https://enterprisecraftsmanship.com/posts/dry-damp-unit-tests/
- https://en.wikipedia.org/wiki/Rubber_duck_debugging
- https://en.wikipedia.org/wiki/Extract,_transform,_load
- https://inbo.github.io/coding-club/sessions/20230926_functions_in_r.html#1
- https://youtu.be/7oyiPBjLAWY?feature=shared
- https://yihui.org/en/2018/06/cache-invalidation/
- https://style.tidyverse.org/functions.html#naming
- https://style.tidyverse.org/syntax.html#object-names
- https://design.tidyverse.org/important-args-first.html
- https://design.tidyverse.org/required-no-defaults.html
- https://magrittr.tidyverse.org/
- https://design.tidyverse.org/defaults-short-and-sweet.html
- https://design.tidyverse.org/enumerate-options.html
- https://design.tidyverse.org/boolean-strategies.html
- https://roxygen2.r-lib.org/articles/roxygen2.html
- https://forcats.tidyverse.org/
- https://r-pkgs.org/man.html
- https://devguide.ropensci.org/building.html#roxygen2-use
- https://github.com/r-lib/pkgdown
- https://r-pkgs.org/
- https://cran.r-project.org/doc/manuals/r-release/R-exts.html
- https://adv-r.hadley.nz
- https://devguide.ropensci.org/building.html#naming-your-package
- https://r-lib.github.io/available/
- https://www.njtierney.com/post/2018/06/20/naming-things/
- https://yihui.org/en/2017/12/typing-names/
- https://codemeta.github.io/
- https://docs.ropensci.org/codemetar/index.html#why-create-a-codemetajson-for-your-package
- https://devguide.ropensci.org/building.html#console-messages
- https://cli.r-lib.org/
- https://github.com/hadley/assertthat
- https://cli.r-lib.org/reference/cli_abort.html
- https://github.com/frictionlessdata/frictionless-r/commit/aad0cd8e894a5a556d2a197348ba9169c267a55b
- https://usethis.r-lib.org/reference/use_readme_rmd.html
- https://usethis.r-lib.org/reference/badges.html
- https://usethis.r-lib.org/reference/use_github_action.html
- https://pkgdown.r-lib.org/articles/pkgdown.html
- https://enpiar.com/2017/11/21/getting-down-with-pkgdown/
- https://devguide.ropensci.org/building.html#website
- https://ropensci.org/blog/2018/10/08/orcid/
- https://r-pkgs.org/description.html
- https://cran.rstudio.com/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file
- https://devguide.ropensci.org/building.html#citation-file
- https://docs.ropensci.org/cffr/reference/cff_gha_update.html
- https://docs.ropensci.org/cffr/
- https://ropensci.org/blog/2021/11/16/how-to-cite-r-and-r-packages/
- https://mit-license.org/
- https://r-pkgs.org/license.html
- https://dplyr.tidyverse.org/
- https://style.tidyverse.org/files.html
- https://pytest-cov.readthedocs.io/en/
- https://docs.pytest.org/en/latest
- https://docs.readthedocs.io/en/stable/intro/getting-started-with-sphinx.html
- https://docs.python-guide.org/
- https://virtualenv.pypa.io/en/stable/user_guide.html
- https://conda.io/projects/conda/en/latest/user-guide/index.html
- https://pip.pypa.io/en/stable/user_guide/#requirements-files
- https://peps.python.org/pep-0008/
- https://pypi.org/project/pycodestyle/
- https://docs.pytest.org/en/latest/how-to/index.html#how-to
- https://packaging.python.org/
- https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/
- https://docs.b-cubed.eu/
- https://github.com/b-cubed-eu/documentation/tree/main/tutorials
- https://docs.ropensci.org/frictionless/articles/frictionless.html

My little script

It also identifies headers and internal links, but I didn't actually check those.
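
If someone wants to check the headers and internal links as well, a first step would be to turn each `href` into an absolute URL. Below is a minimal sketch in base R; `resolve_href()` is a hypothetical helper that is not part of the script, and the `#testing` anchor in the example is made up. It assumes hrefs are either `#fragment`, root-relative (`/path`), or already absolute, so protocol-relative (`//host/path`) and other relative forms are not handled.

```r
# Hypothetical helper: convert href values found on a page to absolute URLs.
resolve_href <- function(href, page_url, origin = "https://docs.b-cubed.eu") {
  # Drop any fragment from the page URL before appending a new one.
  page_base <- sub("#.*$", "", page_url)
  ifelse(
    startsWith(href, "#"),
    paste0(page_base, href),   # heading on the same page
    ifelse(
      startsWith(href, "/"),
      paste0(origin, href),    # path relative to the site root
      href                     # assumed to be absolute already
    )
  )
}

# Example with one link of each kind ("#testing" is a made-up anchor).
hrefs <- c("#testing", "/dev-guide/", "https://semver.org/")
resolved <- resolve_href(hrefs, page_url = "https://docs.b-cubed.eu/dev-guide/")
print(resolved)
```

Note that a status check alone cannot validate heading links: the fragment is not sent to the server, so the page's HTML would have to be searched for an element with a matching `id`.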

# Test all links on a webpage

## GOAL 1: list all links
## GOAL 2: test all links

# for the anatomy of a url, I referred to
# https://www.netmeister.org/blog/urls.html

# load libraries ----------------------------------------------------------

library(rvest)
library(httr2)
library(purrr)
library(dplyr)

# set domain to test ------------------------------------------------------

home_url <- "https://docs.b-cubed.eu/dev-guide/"
hostname <- "https://docs.b-cubed.eu"

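# Return the href attribute of every <a> element on a page
get_links <- function(url){
  rvest::read_html(url) %>%
    # html_elements() supersedes html_nodes() since rvest 1.0.0
    rvest::html_elements("a") %>%
    rvest::html_attr("href")
}

home_links <- get_links(home_url)

# internal links ----------------------------------------------------------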

# should be prefixed with the hostname
prefix_with_hostname <-
  function(pathname, hostname = "https://docs.b-cubed.eu") {
    paste0(hostname, pathname)
  }

## all pages of the dev guide ---------------------------------------------
pages_to_test <-
  home_links[stringr::str_starts(home_links, stringr::fixed("/dev-guide/"))] %>%
  prefix_with_hostname()

# check if url resolves ---------------------------------------------------

# Return error when url does not resolve
check_url <- function(url) {
  httr2::request(url) %>%
    httr2::req_retry(max_tries = 3, max_seconds = 2) %>%
    httr2::req_user_agent(string = "r_link_checker") %>%
    httr2::req_perform() %>%
    httr2::resp_check_status()
}

# get links of the dev guide ----------------------------------------------

all_links <-
  purrr::map(pages_to_test, get_links) %>%
  purrr::set_names(pages_to_test)

# if it starts with # it's a link within the same path, if it starts with / it's
# a link starting from the hostname

prefix_with_page_url <- function(pathname, page_url){
  paste0(page_url,pathname)
}

# convert to a tibble with a column for url, is_heading for #, is_internal_link
# for internal links, and is_external_link for external links

identify_links <- function(links_vector) {
  dplyr::tibble(
    url = links_vector,
    is_heading = stringr::str_starts(url, stringr::fixed("#")),
    is_internal_link = stringr::str_starts(url, stringr::fixed("/")),
    # anything else (including mailto: links) is treated as external
    is_external_link = !(is_heading | is_internal_link)
  )
}

# extract the external links only

external_links <-
  purrr::map(all_links, identify_links) %>%
  purrr::map_dfr(~dplyr::filter(.x, is_external_link)) %>%
  dplyr::distinct(url) %>%
  dplyr::pull(url)

# test the external links -------------------------------------------------

external_links_test_result <-
  purrr::map(external_links, purrr::safely(check_url),
           .progress = TRUE) %>%
  purrr::set_names(external_links)

external_links_broken <-
  purrr::keep(external_links_test_result,
              function(x) {
                !is.null(purrr::chuck(x, "error"))
              })

# print a little report ---------------------------------------------------

cli::cli_h2("The following external links are broken:")
cli::cli_li(items = names(external_links_broken))

peterdesmet commented 5 months ago

I've added a small .gitkeep file to the tutorials directory, so that directory becomes available: https://github.com/b-cubed-eu/documentation/tree/main/tutorials

That means there are no more broken links. Thanks for checking!