Problem
You can, for example, pass a national URL to get_subnational_data() and no error is raised. The function still extracts data, and the region column gets filled with Mobility Report en.pdf (because this variable is filled using a str_split() index).
Example
Passing the GB PDF to get_subregion_data().
get_subregion_data("https://www.gstatic.com/covid19/mobility/2020-04-05_GB_Mobility_Report_en.pdf")
## A tibble: 900 x 6
#   date       country region                 location      entity      value
#   <chr>      <chr>   <chr>                  <chr>         <chr>       <dbl>
# 1 2020-04-05 GB      Mobility Report en.pdf Aberdeen City retail_recr -0.84
# ...
Solution
Detect whether the input is the path to a national or a regional file, and raise an error when it doesn't match the function being called. The check could be based on the number of str_split() elements, but this will depend on the consistency of the URL format.
length(str_split("2020-04-05_US_Alabama_Mobility_Report_en.pdf", "_")[[1]]) # 6 elements
length(str_split("2020-04-05_GB_Mobility_Report_en.pdf", "_")[[1]]) # 5 elements
Or perhaps there's an element in the PDFs themselves that can help identify whether it's national or subnational.
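A minimal sketch of the element-count idea, in case it helps the discussion. classify_report() and check_subnational() are hypothetical names, not functions in the package, and the sketch assumes file names always follow the pattern <date>_<country>[_<region>]_Mobility_Report_en.pdf. It uses base R strsplit() instead of str_split() so it runs without extra packages, and it treats anything longer than five elements as subnational, on the assumption that a region name could itself contain underscores.

```r
# Hypothetical helper (not part of the package): classify a report by file name.
# Assumes names look like <date>_<country>[_<region>]_Mobility_Report_en.pdf.
classify_report <- function(path) {
  parts <- strsplit(basename(path), "_", fixed = TRUE)[[1]]
  n <- length(parts)

  # The last three pieces are expected to be "Mobility", "Report", "en.pdf".
  suffix_ok <- n >= 5 &&
    identical(parts[(n - 2):n], c("Mobility", "Report", "en.pdf"))
  if (!suffix_ok) {
    stop("Unrecognised report file name: ", basename(path), call. = FALSE)
  }

  # date + country + 3 suffix pieces = 5 elements means no region part.
  if (n == 5) "national" else "subnational"
}

# Hypothetical guard that a subnational reader could call before parsing.
check_subnational <- function(path) {
  if (classify_report(path) != "subnational") {
    stop("Expected a subnational report, got a national one: ",
         basename(path), call. = FALSE)
  }
  invisible(path)
}

classify_report("https://www.gstatic.com/covid19/mobility/2020-04-05_GB_Mobility_Report_en.pdf")
# "national"
classify_report("2020-04-05_US_Alabama_Mobility_Report_en.pdf")
# "subnational"
```

With a guard like this, passing the GB URL to the subnational function would stop with an error instead of silently filling the region column. It still breaks if the file naming changes, so it shares the consistency caveat mentioned above.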
Risk
Minimal. Perhaps only a problem if a third party uses the function incorrectly.