EdwinTh / padr

Padding of missing records in time series
https://edwinth.github.io/padr/

thicken doesn't work with a tibble of one row #67

Closed Fuco1 closed 5 years ago

Fuco1 commented 5 years ago

I have a script which processes some dplyr tibbles. Sometimes the tibble only has one row and it seems thicken doesn't work on that:

> ii
# A tibble: 1 x 4
  sku_id date                quantity is_promo
   <int> <dttm>                 <dbl> <lgl>   
1   5562 2018-07-23 00:00:00      0.5 FALSE   
> ii %>% thicken('month')
Error in if (check_for_sorting(dt_var)) { : 
  missing value where TRUE/FALSE needed
> bind_rows(ii, ii) %>% thicken('month')
# A tibble: 2 x 5
  sku_id date                quantity is_promo date_month
   <int> <dttm>                 <dbl> <lgl>    <date>    
1   5562 2018-07-23 00:00:00      0.5 FALSE    2018-07-01
2   5562 2018-07-23 00:00:00      0.5 FALSE    2018-07-01
> 

Here's a dump of the tibble:

ii <-
structure(list(sku_id = 5562L, date = structure(1532304000, class = c("POSIXct", 
"POSIXt"), tzone = "GMT"), quantity = 0.5, is_promo = FALSE), row.names = c(NA, 
-1L), spec = structure(list(cols = list(sku_id = structure(list(), class = c("collector_integer", 
"collector")), date = structure(list(), class = c("collector_character", 
"collector")), quantity = structure(list(), class = c("collector_double", 
"collector")), is_promo = structure(list(), class = c("collector_integer", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1), class = "col_spec"), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"))

EdwinTh commented 5 years ago

I cannot reproduce the error. First check: are you on a recent version of padr (v.0.4.0 or higher)?

Fuco1 commented 5 years ago

Could any dependencies cause this? I dug into the code and I can't find where the function get_dt_var_and_name is defined (that's the one which generates the dt_var value).

[17:59:54]matus@thales:~/R/x86_64-pc-linux-gnu-library/3.5/padr
> cat DESCRIPTION
Package: padr
Type: Package
Title: Quickly Get Datetime Data Ready for Analysis
Version: 0.4.1
Author: Edwin Thoen
Maintainer: Edwin Thoen <edwinthoen@gmail.com>
Description: Transforms datetime data into a format ready for analysis.
    It offers two core functionalities; aggregating data to a higher level interval
    (thicken) and imputing records where observations were absent (pad).
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Depends: R (>= 3.0.0)
Imports: Rcpp, dplyr (>= 0.6.0), lubridate, rlang
Suggests: ggplot2, testthat, knitr, rmarkdown, lazyeval, tidyr,
        data.table, lintr
RoxygenNote: 6.0.1
LinkingTo: Rcpp
VignetteBuilder: knitr
URL: https://github.com/EdwinTh/padr
BugReports: https://github.com/EdwinTh/padr/issues
ByteCompile: true
NeedsCompilation: yes
Packaged: 2018-06-26 10:12:45 UTC; edwinthoen
Repository: CRAN
Date/Publication: 2018-06-26 19:23:11 UTC
Built: R 3.5.2; x86_64-pc-linux-gnu; 2019-03-02 11:37:43 UTC; unix

EdwinTh commented 5 years ago

My mistake: I could not reproduce it because I was on the dev version. In the upcoming version, the warning that the data frame is unordered will be removed from thicken. The check on the order of the datetime variable is what causes the problem, so removing that check will resolve this bug automatically.
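
For illustration only, here is a base R sketch of how an order check can return NA on a single-element vector and so trigger "missing value where TRUE/FALSE needed" inside an if (). This is not padr's actual source: is_unsorted_naive and is_unsorted_guarded are hypothetical helpers, and the assumption is that the check compares each element with its successor using 1:(n - 1) style indexing.

```r
# Hypothetical order check, NOT padr's real check_for_sorting.
# For n == 1, 1:(n - 1) is c(1, 0) and 2:n is c(2, 1), so x[2:n]
# contains an out-of-bounds NA and all() returns NA.
is_unsorted_naive <- function(x) {
  n <- length(x)
  !all(x[1:(n - 1)] <= x[2:n])
}

# Guarded variant: a vector with fewer than two elements is sorted by definition.
is_unsorted_guarded <- function(x) {
  n <- length(x)
  if (n < 2) return(FALSE)
  !all(x[-n] <= x[-1])
}

x1 <- as.POSIXct("2018-07-23", tz = "GMT")

print(is_unsorted_naive(x1))    # NA, so if (is_unsorted_naive(x1)) would error
print(is_unsorted_guarded(x1))  # FALSE
```

Base R's is.unsorted() also returns FALSE for a length-one input, so it could stand in for the guarded helper.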

You can install the dev version with devtools::install_github("EdwinTh/padr", ref = "v.0.5.0")

Fuco1 commented 5 years ago

@EdwinTh Thanks!