Closed: andrew-edwards closed this issue 8 months ago
RCurl::url.exists() seems to work (on Linux), but a better way might be to use the httr package, like this:
```r
x <- httr::GET("https://climatedataguide.ucar.edu/sites/default/files/2023-04/npindex_monthly.txt")
x$status_code
```
Look for status codes 403, 404, etc.; 200 means it exists. You could even grab the data from this output object, doing it all in one call.
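As a minimal sketch of that idea (the helper name url_exists() is illustrative, and treating only status 200 as success is an assumption, not something from this thread):

```r
library(httr)

# Illustrative helper: TRUE if the URL responds with HTTP status 200.
url_exists <- function(url) {
  x <- try(GET(url), silent = TRUE)  # GET() itself errors on e.g. DNS failures
  if (inherits(x, "try-error")) {
    return(FALSE)
  }
  status_code(x) == 200
}

# "Doing it all in one call": reuse the same response object for the data.
x <- GET("https://climatedataguide.ucar.edu/sites/default/files/2023-04/npindex_monthly.txt")
if (status_code(x) == 200) {
  txt <- content(x, as = "text")
}
```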
Thanks Chris - that should work.
Annoyingly, some of the websites for climatic indices contain the month and year in them, making it hard to automatically grab the latest calculations, since the pathname or filename keeps changing.
For example: https://climatedataguide.ucar.edu/sites/default/files/2023-04/npindex_monthly.txt
A while ago I tried things like this:
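(The exact snippet isn't preserved in this copy of the issue; given the FALSE result described next, it was presumably something along these lines:)

```r
RCurl::url.exists("https://climatedataguide.ucar.edu/sites/default/files/2023-04/npindex_monthly.txt")
#> [1] FALSE
```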
So that function returns FALSE even though the site exists. I then gave up and wrote a somewhat clunky manual approach (checking each month, going back from the current one to the last one saved; in future we plan to update these every two months, so it is not too onerous, but it would still be nice to automate). See here:
https://github.com/pbs-assess/pacea/blob/f0e696e49fd6295637af4a153a52b5aec064068c/data-raw/coastwide-indices/coastwide-indices.R#L94
So... does anyone know of other functions to detect whether a website exists or not? Then I could automate the looping back in time through the months.
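For what it's worth, here is a rough sketch of how that loop might look once a reliable check is available, assuming a url_exists() helper like the one sketched above (the URL pattern and the 24-month search window are illustrative, not taken from pacea):

```r
# Step back month by month from today until an existing file is found.
# Illustrative only; assumes url_exists() as sketched earlier.
find_latest_index_url <- function(max_months_back = 24) {
  base <- "https://climatedataguide.ucar.edu/sites/default/files/%s/npindex_monthly.txt"
  # Use the first of the month to avoid end-of-month rollover in seq().
  start  <- as.Date(format(Sys.Date(), "%Y-%m-01"))
  months <- seq(start, by = "-1 month", length.out = max_months_back + 1)
  for (i in seq_along(months)) {
    url <- sprintf(base, format(months[i], "%Y-%m"))
    if (url_exists(url)) {
      return(url)
    }
  }
  NULL  # nothing found within the window
}
```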