jhudsl / OTTR_Template

OTTR for making courses! This is a template repo that helps people write 1 course but publish it in three places
Creative Commons Attribution 4.0 International

URL checker for GitHub pages domains doesn't fail in the way we'd expect #563

Closed by cansavvy 2 years ago

cansavvy commented 2 years ago

See here for example: https://github.com/jhudsl/OTTR_Template_Website/actions/runs/3092279525/jobs/5003367582

This URL: https://jhudatascience.org/intro_to_r2/ is broken in the meaningful sense, but according to the test we use:

test_url <- function(url) {
  message(paste0("Testing: ", url))
  url_status <- try(httr::GET(url), silent = TRUE)
  status <- ifelse(suppressMessages(grepl("Could not resolve host", url_status)), "failed", "success")
}

The test only checks whether the host can be resolved. Here the host does resolve, so the URL is not classified as broken even though the page itself is missing.

So, we need to think if there's a way to catch these types of URLs as well, especially since GitHub URLs are probably a lot of what we reference.

cansavvy commented 2 years ago

@carriewright11 Feel free to add to this issue if there's something else I missed.

carriewright11 commented 2 years ago

This one also didn't get caught by the check: https://bit.ly/ITCR_2023, so perhaps this is a more general 404 issue?
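
If the underlying problem is that error responses such as 404 are counted as successes, one possible direction is to look at the HTTP status code instead of only the "Could not resolve host" message. The following is a minimal sketch, not the fix adopted in this repo: `classify_status()` and `test_url_strict()` are hypothetical names, the 400 cutoff is an assumption, and the `httr` package is assumed to be installed.

```r
# Hypothetical helper (not part of OTTR): map an HTTP status code, or NA when
# the request itself failed, to the "failed"/"success" labels used by test_url().
classify_status <- function(status_code) {
  # Assumption: any 4xx or 5xx response counts as a broken URL.
  if (is.na(status_code) || status_code >= 400) "failed" else "success"
}

# Sketch of a stricter variant of test_url(); assumes httr is installed.
test_url_strict <- function(url) {
  message(paste0("Testing: ", url))
  resp <- try(httr::GET(url), silent = TRUE)
  # A try-error covers unresolved hosts, timeouts, and other transport errors.
  code <- if (inherits(resp, "try-error")) NA_integer_ else httr::status_code(resp)
  classify_status(code)
}
```

Under these assumptions, `test_url_strict("https://jhudatascience.org/intro_to_r2/")` would report "failed" whenever the server answers with a 4xx or 5xx code. One caveat: some servers reject automated requests with codes such as 403, so a check like this may flag URLs that work fine in a browser.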

carriewright11 commented 2 years ago

Hopefully we can also catch issues with pages not being set up correctly with _site.yml ... currently it appears that they just show up as a 404 if you click on that tab.