EdwinTh / padr

Padding of missing records in time series
https://edwinth.github.io/padr/

pad() returns an object missing error for data frames larger than 1 million rows #44

Closed danielsjf closed 7 years ago

danielsjf commented 7 years ago

It didn't work before either (the version on CRAN returned a normal error message), but I guess one of the variables used to generate the error message is now missing in the development version.

(reprex compatible)

library(tidyverse)
library(padr)
x <- expand.grid(date = ISOdate(2017:2025,1,1,0), item = 1:20) %>% group_by(item) %>% pad('hour')

Error:

Error in sprintf("Estimated %s returned rows, larger than %s milion in break_above",  : 
  object 'break_above' not found

Will you support more than 1 million lines in the future? I could live with longer process times.

danielsjf commented 7 years ago

I think the following lines need to change: https://github.com/EdwinTh/padr/blob/ee118b995fbbcc72e9b5b1be2db7ae8562af038a/R/pad.R#L402-L406

They should become:

  threshold <- threshold * 10 ^ 6
  if (n > threshold) {
    stop(sprintf("Estimated %s returned rows, larger than %s milion in break_above",
                 n, threshold), call. = FALSE)
  }

The only change is on the last line, where break_above is replaced by threshold.
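One detail worth noting about the snippet above: `threshold` has already been multiplied by 10^6 when it is passed to `sprintf()`, so the message would report something like "1e+06 milion" rather than "1 million". Below is a minimal stand-alone sketch of the same guard that keeps the value in millions for the message. The function name `check_row_estimate` and its signature are hypothetical, chosen for illustration only; this is not padr's actual source.

```r
# Hypothetical stand-alone version of the row-count guard (not padr's code).
# n:           estimated number of rows after padding
# break_above: limit in millions of rows, as in pad()'s break_above argument
check_row_estimate <- function(n, break_above = 1) {
  threshold <- break_above * 10^6
  if (n > threshold) {
    stop(sprintf("Estimated %s returned rows, larger than %s million in break_above",
                 format(n, scientific = FALSE), break_above),
         call. = FALSE)
  }
  invisible(n)
}

# The guard passes silently below the limit and stops above it.
check_row_estimate(500)
msg <- tryCatch(check_row_estimate(1402580),
                error = function(e) conditionMessage(e))
print(msg)
```

Using `format(n, scientific = FALSE)` avoids large counts being printed in scientific notation when they are substituted into `%s`.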

EdwinTh commented 7 years ago

Coincidentally, I noticed it myself this morning too. Thanks for the fix!

EdwinTh commented 7 years ago

"Will you support more than 1 million lines in the future? I could live with longer process times."

That limit is set by the break_above parameter of pad(), which is expressed in millions of rows, so you can raise it if you want more rows.
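For the reprex at the top of this issue, that could look like the sketch below. It first estimates the padded row count in base R to pick a sufficient `break_above`, then calls `pad()` only if padr and dplyr are installed. The helper variables (`dates`, `est_rows`, `needed`) are illustrative names, and loading dplyr instead of the full tidyverse is a simplification made here.

```r
# Estimate how many rows hourly padding produces for the reprex data.
dates   <- ISOdate(2017:2025, 1, 1, 0)
n_items <- 20
hours_per_item <- as.numeric(difftime(max(dates), min(dates), units = "hours")) + 1
est_rows <- hours_per_item * n_items   # roughly 1.4 million rows
needed   <- ceiling(est_rows / 10^6)   # break_above is given in millions

# Run the actual padding only when the packages are available.
if (requireNamespace("padr", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  library(padr)
  x <- expand.grid(date = dates, item = 1:n_items) %>%
    group_by(item) %>%
    pad("hour", break_above = needed)
}
```

Passing a larger value simply raises the point at which `pad()` refuses to run; memory use and run time still grow with the number of returned rows.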

danielsjf commented 7 years ago

My bad, I didn't see that the argument was exposed :-) Thanks!