Displayr / flipTime

Tools for manipulating and presenting time series data

flipTime::AsDate() fails to produce results when date formats in vector are not identical #2

Closed tcwilkinson closed 4 years ago

tcwilkinson commented 4 years ago

This happens when the elements of the input vector do not all share the same date format. Reproducible examples:

These work:

> flipTime::AsDate(c("2000","2001"))
[1] "2000-01-01" "2001-01-01"
> flipTime::AsDate(c("21-Jun-2000","12-Jun-1999"))
[1] "2000-06-21" "1999-06-12"

This fails:

> flipTime::AsDate(c("2000","12-Jun-1999"),on.parse.failure="warn")
[1] NA NA
Warning message:
In handleParseFailure(deparse(substitute(x)), length(x), on.parse.failure) :
  Could not parse c("2000", "12-Jun-1999") into a valid date in any format.

Obviously vectorising AsDate() is an option as a workaround, but it's an ugly clunk:

> vectorizedAsDate <- Vectorize(flipTime::AsDate)
> vectorizedAsDate(c("2000","12-Jun-1999"),on.parse.failure="warn")
       2000 12-Jun-1999 
      10957       10754 
> as.Date(vectorizedAsDate(c("2000","12-Jun-1999"),on.parse.failure="warn"),origin= "1970-01-01")
        2000  12-Jun-1999 
"2000-01-01" "1999-06-12" 
> as.Date(as.vector(vectorizedAsDate(c("2000","12-Jun-1999"),on.parse.failure="warn")),origin= "1970-01-01")
[1] "2000-01-01" "1999-06-12"
> as.Date(as.vector(vectorizedAsDate(c("2000","12-Jun-1999","silly date"),on.parse.failure="warn")),origin= "1970-01-01")
[1] "2000-01-01" "1999-06-12" NA

Would be great if this could be handled internally to AsDate().
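
For anyone who needs the element-wise behaviour in the meantime, the Vectorize() steps above can be folded into one small helper. This is only a sketch: the name `as_date_each` and its `parser` argument are made up for illustration and are not part of flipTime. It assumes flipTime is installed when the default parser is used, and that a failed parse comes back as NA, as in the "silly date" example above.

```r
# Hypothetical helper, not part of flipTime: parse each element on its own
# and return a plain, unnamed Date vector.
as_date_each <- function(x, parser = flipTime::AsDate, ...) {
  # Parse one element at a time and keep the result as days since 1970-01-01.
  days <- vapply(
    x,
    function(el) as.numeric(parser(el, ...)),
    numeric(1),
    USE.NAMES = FALSE
  )
  # Convert the day counts back to Date; unparseable elements stay NA.
  as.Date(days, origin = "1970-01-01")
}

# Usage (requires flipTime):
# as_date_each(c("2000", "12-Jun-1999", "silly date"), on.parse.failure = "warn")
```

Parsing each element alone means a value can no longer borrow format information from its neighbours, so ambiguous strings may be read differently than they would be inside a consistently formatted vector.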

chschan commented 4 years ago

Thanks for your comments and the examples, but AsDate is actually doing what it's supposed to. The function tries to use information from other values in the vector to guess the format of the dates. For example, this will give you warnings about ambiguous date formats: AsDate(c("01-01-2010")), but not AsDate(c("01-01-2010", "13-01-2010"))
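
To illustrate the idea of using the whole vector to pick a format, here is a rough base-R sketch. It is not flipTime's actual implementation: the function name `formats_fitting_all` and the two-entry candidate list are assumptions made purely for illustration.

```r
# Hypothetical candidate formats (day-first and month-first); the real list
# used by flipTime is not shown in this thread.
candidate_formats <- c("%d-%m-%Y", "%m-%d-%Y")

# Return every candidate format that parses ALL elements of x.
formats_fitting_all <- function(x, formats = candidate_formats) {
  fits <- vapply(
    formats,
    function(f) !anyNA(as.Date(x, format = f)),
    logical(1)
  )
  formats[fits]
}

formats_fitting_all("01-01-2010")                   # both candidates fit: ambiguous
formats_fitting_all(c("01-01-2010", "13-01-2010"))  # only the day-first format fits
formats_fitting_all(c("2000", "12-06-1999"))        # no single format fits every value
```

Under this approach a vector with mixed formats has no single format that fits every value, which is the situation in the failing example from the original report.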

As you say, one alternative is to vectorize it. You might also want to look at the anytime package. We had some trouble with it when the format is ambiguous (i.e. could be US or international format) but it is very flexible.