reconhub / linelist

An R package to import, clean, and store case data
https://www.repidemicsconsortium.org/linelist

guess_dates has unexpected behavior for future dates #65

Closed thibautjombart closed 5 years ago

thibautjombart commented 5 years ago

There are two issues here:

  1. inconsistent handling of future dates depending on whether they are character or Date (see below; the Date case in particular gives a cryptic error message)
  2. future dates should be allowed if people want them; can we have an argument allow_future defaulting to FALSE, which will set dates in the future to NA, and otherwise process them the same way as other dates?
> linelist::guess_dates("2019-09-10")
[1] "2019-09-10"
Warning message:
In linelist::guess_dates("2019-09-10") : 
The following dates were not in the correct timeframe (1969-03-28 -- 2019-03-28):

  original    |  parsed    
  -----       |  -----     
  2019-09-10  |  2019-09-10
  NA          |  NA        
> linelist::guess_dates(as.Date("2019-09-10"))
Error in stack.default(bd) : at least one vector element is required
> 
zkamvar commented 5 years ago
  1. inconsistent handling of future date depending on whether they are character or Date

Good catch! Passing a Date vector to the function is somewhat unusual, but the function should simply return a Date when the input is Date or POSIXt.

  2. future dates should be allowed if people want them; can we have an argument allow_future defaulting to FALSE, which will set dates in the future to NA, and otherwise process them the same way as other dates?

Agreed that future dates should be set to NA instead of being processed. I don't think allow_future is necessary considering that the last_date argument can be set to one year from now: last_date = Sys.Date() + 365.
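
To make the two points above concrete, here is a minimal base R sketch of the intended behavior: Date or POSIXt input comes back as a Date, and anything after last_date becomes NA. The helper name clamp_future_dates is hypothetical and is not part of linelist; it also assumes character input is already in yyyy-mm-dd format, which guess_dates itself does not require.

```r
# Hypothetical helper, NOT part of linelist: a sketch of the behavior
# discussed above, under the assumptions stated in the lead-in.
clamp_future_dates <- function(x, last_date = Sys.Date()) {
  if (inherits(x, "POSIXt")) {
    # Date-times are reduced to plain dates.
    x <- as.Date(x)
  } else if (!inherits(x, "Date")) {
    # Assumption: character input is ISO 8601 (yyyy-mm-dd).
    x <- as.Date(as.character(x), format = "%Y-%m-%d")
  }
  last_date <- as.Date(last_date)
  # Dates after last_date are set to NA; all others are kept unchanged.
  x[!is.na(x) & x > last_date] <- NA
  x
}

# Date input past the cutoff is returned as an NA Date rather than an error.
clamp_future_dates(as.Date("2019-09-10"), last_date = "2019-03-28")
#> [1] NA

# Moving last_date forward keeps the same date.
clamp_future_dates("2019-09-10", last_date = "2020-01-01")
#> [1] "2019-09-10"
```

With this shape, a caller who wants dates up to a year ahead would pass last_date = Sys.Date() + 365 instead of needing a separate allow_future switch.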

zkamvar commented 5 years ago

Addendum: future dates are indeed set to NA when the number of errors is within the tolerance level.

# returns character
suppressWarnings(linelist::guess_dates("2019-09-10"))
#> [1] "2019-09-10"

# returns NA Date
suppressWarnings(linelist::guess_dates("2019-09-10", error_tol = 1))
#> [1] NA

# returns Date
linelist::guess_dates("2019-09-10", error_tol = 1, last_date = "2020-01-01")
#> [1] "2019-09-10"

Created on 2019-04-05 by the reprex package (v0.2.1)