reconhub / incidence

☣:chart_with_upwards_trend::chart_with_downwards_trend:☣ Compute and visualise incidence
https://reconhub.github.io/incidence
Other
58 stars 13 forks source link

Allow yyyy-mm-dd formatted character input #101

Closed zkamvar closed 5 years ago

zkamvar commented 5 years ago

This reverts #90 and further addresses #88 by only allowing a narrow set of valid inputs for character data. This also re-instates first_date and last_date's ability to take character data, preventing a breaking change.

This will also throw an error if the user attempts to pass a date that is incorrectly formatted. The benefit here is that it alerts the user to where the problem is as opposed to as.Date(), which would convert something like 84-01-01 to mean the year 0084 🙄.

library("incidence")
dats <- Sys.Date() + sample(-100:100, 5)
datc <- as.character(dats)
datc
#> [1] "2019-03-11" "2018-12-07" "2019-04-17" "2019-04-09" "2019-04-10"
incidence(datc)
#> <incidence object>
#> [5 cases from days 2018-12-07 to 2019-04-17]
#> 
#> $counts: matrix with 132 rows and 1 columns
#> $n: 5 cases in total
#> $dates: 132 dates marking the left-side of bins
#> $interval: 1 day
#> $timespan: 132 days
#> $cumulative: FALSE

# Dates with letters in them
datc[3] <- "1Q84-11-24"
incidence(datc)
#> Error in incidence.character(datc): Not all dates are in ISO 8601 standard format (yyyy-mm-dd). The first incorrect date is 1Q84-11-24

# Incorrectly formatted
datc[3] <- "84-11-24"
incidence(datc)
#> Error in incidence.character(datc): Not all dates are in ISO 8601 standard format (yyyy-mm-dd). The first incorrect date is 84-11-24

# Impossible dates
datc[3] <- "1984-24-11"
incidence(datc)
#> Error in incidence.character(datc): Not all dates are in ISO 8601 standard format (yyyy-mm-dd). The first incorrect date is 1984-24-11

Created on 2019-01-09 by the reprex package (v0.2.1)

zkamvar commented 5 years ago

Note: Appveyor failing is basically it's default state.

zkamvar commented 5 years ago

I would be interested to get a review from either @thibautjombart or @jrcpulliam since this affects the original issue and subsequent PR that y'all worked on.

zkamvar commented 5 years ago

Also, I would like to have this merged within the week so that I can submit it to CRAN to prevent it from getting kicked off.

zkamvar commented 5 years ago

Note that I will merge this by Wednesday, 16 January regardless if anyone else looks at it due to the CRAN timeline.

thibautjombart commented 5 years ago

Sorry I am running around like a headless chicken. One thing that is not shown in the examples above is:

as.Date("01-01-2001")
## [1] "1-01-20"

How is this handled by the new code?

zkamvar commented 5 years ago

@thibautjombart, you missed the new incidence.character() method, which explicitly checks for ISO 8601 date format before it's ever passed to as.Date().

https://github.com/reconhub/incidence/pull/101/files#diff-c6f51f32e05e72ba23d4fc3d7d948d12R212

Here's how it behaves with ambiguous dates: If there is even one item in the vector improperly formatted, the function will bork.

incidence::incidence("02-01-2001")
#> Error in incidence.character("02-01-2001"): Not all dates are in ISO 8601 standard format (yyyy-mm-dd). The first incorrect date is 02-01-2001
incidence::incidence("02-01-01")
#> Error in incidence.character("02-01-01"): Not all dates are in ISO 8601 standard format (yyyy-mm-dd). The first incorrect date is 02-01-01

Created on 2019-01-12 by the reprex package (v0.2.1)

When I first saw the change to disallow characters, I thought it was a bit overkill since it's fairly trivial to write the "[0-9]{4}-[0-9]{2}-[0-9]{2}" regex to check the incoming characters first.

zkamvar commented 5 years ago

As this was approved by Thibaut in person, I will merge this branch.