TysonStanley / tidyfast

Fast and efficient alternatives to tidyr functions built on data.table #rdatatable #rstats
https://tysonbarrett.com/tidyfast/
Make all methods S3-methods, for easier integration with other packages e.g. disk.frame #9

Closed xiaodaigh closed 4 years ago

xiaodaigh commented 4 years ago

I want to support tidyfast's verbs with disk.frame but I would need the verbs to be S3-compatible.

E.g. your code would become

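dt_count <- function(dt_, ...) {
  UseMethod("dt_count")
}

dt_count.default <- function(dt_, ..., na.rm = FALSE, wt = NULL) {

  # Coerce to a data.table, assigning back to dt_ so the code below uses it
  if (isFALSE(is.data.table(dt_)))
    dt_ <- as.data.table(dt_)

  # Capture the grouping columns and the weight unevaluated
  dots <- substitute(list(...))
  wt <- substitute(wt)

  # Optionally drop rows with missing values
  if (na.rm)
    dt_ <- dt_[complete.cases(dt_)]

  # Weighted count: sum the weight within each group
  if (!is.null(wt))
    return(dt_[, list(N = sum(eval(wt))), keyby = eval(dots)])

  # Unweighted count: number of rows per group
  dt_[, .N, keyby = eval(dots)]
}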

So in disk.frame, I would implement

dt_count.disk.frame <- function(dt_, ...) {
...
}

That way, we can enable tidyfast verbs to be used with large on-disk datasets :)
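
To make the dispatch pattern above concrete, here is a minimal base-R sketch. Everything in it is hypothetical and for illustration only: `count_by`, `new_chunked`, and the `chunked_frame` class are made-up names, and the list of data.frames merely stands in for on-disk chunks. It does not use or reproduce the tidyfast or disk.frame APIs.

```r
# Hypothetical generic: dispatches on the class of its first argument.
count_by <- function(x, col) {
  UseMethod("count_by")
}

# Default method: count rows per value of one column in a data.frame.
count_by.default <- function(x, col) {
  x <- as.data.frame(x)
  counts <- table(x[[col]])
  data.frame(key = names(counts), N = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Hypothetical chunked class: a list of data.frames standing in for
# chunks that would live on disk in a real implementation.
new_chunked <- function(...) {
  structure(list(chunks = list(...)), class = "chunked_frame")
}

# Method for the chunked class: count within each chunk using the
# default method, then sum the partial counts per key.
count_by.chunked_frame <- function(x, col) {
  partial <- lapply(x$chunks, count_by, col = col)
  combined <- do.call(rbind, partial)
  totals <- tapply(combined$N, combined$key, sum)
  data.frame(key = names(totals), N = as.integer(totals),
             stringsAsFactors = FALSE)
}

# Usage: the same call works on a plain data.frame and on the chunked object.
cf <- new_chunked(
  data.frame(g = c("a", "b", "a")),
  data.frame(g = c("b", "c"))
)
res <- count_by(cf, "g")
print(res)
```

The point of the sketch is that the calling code never changes: once the verb is a generic, another package only has to register a method for its own class, and that method is free to reuse the default one chunk by chunk.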

Relevant issue: https://github.com/xiaodaigh/disk.frame/issues/218

TysonStanley commented 4 years ago

Oh, it'd be great to have support from {disk.frame}. That'll be the next thing I work on. I'll update you when it's ready.

TysonStanley commented 4 years ago

Should be ready for you!

xiaodaigh commented 4 years ago

WIP PR to integrate tidyfast with disk.frame

https://github.com/xiaodaigh/disk.frame/pull/220