Closed ghost closed 4 years ago
Thanks. That needs fixing. Flagged for the next release. Might have to change the argument name too as start_date
is misleading. Or maybe iterate over days since start_date
and add an end_date
argument to align functionality with documentation. If end_date
defaults to start_date
if Null
all current functionality will be retained.
Thoughts @dshkol?
Agree, this should be fixed. The functionality describe is the functionality intended.
I have a fixed-up version in a new branch. https://github.com/mountainMath/cansim/tree/change_tables_fix This should address the problem by adding an optional end_date
argument. If end_date
is specified it will iterate through all dates between start and end date.
Please check to see if that solves the problem, install with
remotes::install_github("mountainmath/cansim@change_tables_fix")
I've tested these two:
get_cansim_changed_tables(start_date = "2018-08-01", end_date = "2019-08-01")
# A tibble: 7,342 x 2
productId releaseTime
<int> <chr>
1 23100251 2018-08-01T08:35
2 33100036 2018-08-01T08:30
3 10100139 2018-08-01T08:30
4 10100125 2018-08-01T08:30
5 10100107 2018-08-01T08:30
6 33100005 2018-08-01T08:30
7 33100033 2018-08-01T08:30
8 33100084 2018-08-01T08:30
9 11100008 2018-08-02T08:35
10 33100163 2018-08-02T08:30
# … with 7,332 more rows
get_cansim_changed_tables(start_date = "2020-09-07", end_date = "2020-09-14")
# A tibble: 97 x 2
productId releaseTime
<int> <chr>
1 10100136 2020-09-08T08:30
2 10100142 2020-09-08T08:30
3 10100108 2020-09-08T08:30
4 33100036 2020-09-08T08:30
5 10100139 2020-09-08T08:30
6 18100204 2020-09-08T08:30
7 36100042 2020-09-08T08:30
8 36100431 2020-09-08T08:30
9 36100430 2020-09-08T08:30
10 18100160 2020-09-08T08:30
# … with 87 more rows
Both work, tho the first one is pretty slow. I wonder if there's a risk of bounced by the api if you run an especially long range?
We should also add a safety check for the inputs if they're the wrong order.
get_cansim_changed_tables(start_date = "2020-09-14", end_date = "2020-09-07")
Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument
Just a quick check of the input args that end_date >= start_date
# A tibble: 12 x 2
productId releaseTime
<int> <chr>
1 10100158 2020-09-14T08:30
2 10100142 2020-09-14T08:30
3 33100036 2020-09-14T08:30
4 10100139 2020-09-14T08:30
5 32100359 2020-09-14T08:30
6 12100120 2020-09-14T08:30
7 18100248 2020-09-14T08:30
8 14100332 2020-09-14T08:30
9 14100331 2020-09-14T08:30
10 14100358 2020-09-14T08:30
11 14100357 2020-09-14T08:30
12 36100491 2020-09-14T08:30
Works ok too.
I added a check on the order and switch if it's mixed up, while also emitting a message informing the user. And added a check for querying time intervals longer than 31 days and emit a message warning the user that it may take long.
I've pulled this into the dev branch for 0.3.6 and this will be fixed with the next release.
The documentation for the function
get_cansim_changed_tables(start_date)
states that it will return "list of tables that have been modified or updated since the specified date". However, the function only returns tables that were changed onstart_date
, not those that have been changed since the specified date.This isn't a big issue because a set of dates can be looped over to get the tables changed over a range of dates. However, this is less elegant than the functionality described in the function's documentation.
Here's what I get if I run the example in the documentation:
Based on the documentation I would expect to see all the tables changed from 2018-08-01 till today.