Closed pbulsink closed 6 years ago
traceback()
9: stop("more columns than column names")
8: read.table(file = file, header = header, sep = sep, quote = quote,
dec = dec, fill = fill, comment.char = comment.char, ...)
7: utils::read.csv(text = httr::content(html, as = "text", type = "text/csv",
encoding = encoding), nrows = nrows, strip.white = TRUE,
skip = skip, header = header, colClasses = "character", check.names = FALSE)
6: eval(lhs, parent, parent)
5: eval(lhs, parent, parent)
4: utils::read.csv(text = httr::content(html, as = "text", type = "text/csv",
encoding = encoding), nrows = nrows, strip.white = TRUE,
skip = skip, header = header, colClasses = "character", check.names = FALSE) %>%
dplyr::mutate_at(.vars = dplyr::vars(dplyr::ends_with("Flag")),
dplyr::funs(gsub("^I$", "^", .)))
3: weather_raw(station_id = s, date = date_range[i], interval = interval,
skip = skip, url = url, encoding = encoding)
2: rbind(w, weather_raw(station_id = s, date = date_range[i], interval = interval,
skip = skip, url = url, encoding = encoding))
1: weather_dl(station_ids = 51423, start = "2018-03-31", end = "2018-04-01")
Thanks for the comprehensive testing! I can see where the problem is (the preamble or metadata changes in the files between pre and post April 2018). I've got a couple of ideas for addressing the problem which will hopefully make the package more robust to future changes. Will try them out and let you know how it goes.
Okay, so the newest version on the dev-steffi branch should have fixed this issue. Could you let me know if it works for you? Note that this will result in a new column (station operator) in data that includes a date after April 1st 2018.
Seems to work on my end too. Thanks for your quick help!
One more error in the same vein: When downloading from multiple station ID's with old and new data (such as 4291 and 27534, where the latter 'supersceded' the former for data collection in 1997), the rbind fails as they have different numbers of columns. Example code:
weather_dl(c(4291, 27534), interval = 'day')
And the traceback:
traceback()
4: stop("numbers of columns of arguments do not match")
3: rbind(deparse.level, ...)
2: rbind(w_all, w)
1: weathercan::weather_dl(c(4291, 27534), interval = "day", start = "1996-01-01")
A proposed fix could be using dplyr::bind_rows() instead of base::rbind() as it quietly and automatically fills missing columns with NA
values.
Ah thanks for catching that, I overlooked that part of the testing. Will fix it now.
Okay, columns are added for older data missing the new meta data columns (filled with NA if missing). This should fix the problem when joining new and old data between stations. Let me know how it goes!
Seems to work great now! Thanks
One thing to consider: the builtin kamloops
and kamloops_day
datasets don't have a station_operator
column. When rebuilding for master they should maybe be updated, but this may introduce breaking changes to your or other's testing code.
They should definitely be updated, as should the Read me and Vignettes so they match the current state of the package. I don't think they'll break any tests (that I know of). Thanks for the reminder!
Expected Behavior
weather_dl() should always return a data frame.
Current Behavior
Sometimes, weather_dl() reports
Steps to Reproduce (for bugs)
kam <- weather_dl(station_ids = 51423, start = "2018-03-31", end = "2018-04-01")
kam2 <- weather_dl(station_ids=42203, start = "2018-03-31", end = "2018-04-01")
kempt <- weather_dl(station_ids = 27534, start = "2018-03-31", end = "2018-04-01")
Any other date range is ok:
kam <- weather_dl(station_ids = 51423, start = "2018-03-01", end = "2018-03-31")
kam <- weather_dl(station_ids = 51423, start = "2018-04-01", end = "2018-04-30")
kam <- weather_dl(station_ids = 51423, start = "2017-03-31", end = "2017-04-01")
Possible Solution
Comparing the column names of both dates retrieved individually (2018-03-31, 2018-04-01) returns no differences with testthat::expect_equivalent().
Your Environment
Session info ------------------------------------------------------------------------------------------------------------- setting value
version R version 3.5.0 (2018-04-23) system x86_64, mingw32
ui RStudio (1.1.383)
language (EN)
collate English_Canada.1252
tz America/New_York
date 2018-05-29
Packages ----------------------------------------------------------------------------------------------------------------- package version date source
assertthat 0.2.0 2017-04-11 CRAN (R 3.5.0)
base 3.5.0 2018-04-23 local
bindr 0.1.1 2018-03-13 CRAN (R 3.5.0)
bindrcpp 0.2.2 2018-03-29 CRAN (R 3.5.0)
compiler 3.5.0 2018-04-23 local
curl 3.2 2018-03-28 CRAN (R 3.5.0)
datasets 3.5.0 2018-04-23 local
devtools 1.13.5 2018-02-18 CRAN (R 3.5.0)
digest 0.6.15 2018-01-28 CRAN (R 3.5.0)
dplyr 0.7.5 2018-05-19 CRAN (R 3.5.0)
glue 1.2.0 2017-10-29 CRAN (R 3.5.0)
graphics 3.5.0 2018-04-23 local
grDevices 3.5.0 2018-04-23 local
httr 1.3.1 2017-08-20 CRAN (R 3.5.0)
lubridate 1.7.4 2018-04-11 CRAN (R 3.5.0)
magrittr 1.5 2014-11-22 CRAN (R 3.5.0)
memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
methods 3.5.0 2018-04-23 local
pillar 1.2.1 2018-02-27 CRAN (R 3.5.0)
pkgconfig 2.0.1 2017-03-21 CRAN (R 3.5.0)
purrr 0.2.4 2017-10-18 CRAN (R 3.5.0)
R6 2.2.2 2017-06-17 CRAN (R 3.5.0)
Rcpp 0.12.16 2018-03-13 CRAN (R 3.5.0)
rlang 0.2.0 2018-02-20 CRAN (R 3.5.0)
stats 3.5.0 2018-04-23 local
stringi 1.1.7 2018-03-12 CRAN (R 3.5.0)
stringr 1.3.1 2018-05-10 CRAN (R 3.5.0)
testthat 2.0.0 2017-12-13 CRAN (R 3.5.0)
tibble 1.4.2 2018-01-22 CRAN (R 3.5.0)
tidyr 0.8.1 2018-05-18 CRAN (R 3.5.0)
tidyselect 0.2.4 2018-02-26 CRAN (R 3.5.0)
tools 3.5.0 2018-04-23 local
utils 3.5.0 2018-04-23 local
weathercan 0.2.6 2018-05-29 Github (ropensci/weathercan@3b8d291) withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
xml2 1.2.0 2018-01-24 CRAN (R 3.5.0)
yaml 2.1.18 2018-03-08 CRAN (R 3.5.0)