TysonStanley / tidyfast

Fast and efficient alternatives to tidyr functions built on data.table #rdatatable #rstats
https://tysonbarrett.com/tidyfast/

names_pattern / names_sep / names_to vector? #39

Open tdhock opened 4 years ago

tdhock commented 4 years ago

hi @TysonStanley are there any plans to implement some of the more advanced/new arguments/features of tidyr::pivot_longer, or are you planning to stop adding new features? I'm wondering because I would like to accurately discuss your package in my R Journal article, which compares various methods for data reshaping. For example, here is an example with names_sep and a names_to vector:

> one.iris <- datasets::iris[1,]
> tidyr::pivot_longer(one.iris, 1:4, names_to=c("part", "dim"), names_sep="[.]")
# A tibble: 4 x 4
  Species part  dim    value
  <fct>   <chr> <chr>  <dbl>
1 setosa  Sepal Length   5.1
2 setosa  Sepal Width    3.5
3 setosa  Petal Length   1.4
4 setosa  Petal Width    0.2
> tidyr::pivot_longer(one.iris, 1:4, names_to=c(".value", "dim"), names_sep="[.]")
# A tibble: 2 x 4
  Species dim    Sepal Petal
  <fct>   <chr>  <dbl> <dbl>
1 setosa  Length   5.1   1.4
2 setosa  Width    3.5   0.2
> tidyr::pivot_longer(one.iris, 1:4, names_to=c("part", ".value"), names_sep="[.]")
# A tibble: 2 x 4
  Species part  Length Width
  <fct>   <chr>  <dbl> <dbl>
1 setosa  Sepal    5.1   3.5
2 setosa  Petal    1.4   0.2

I tried the same with your package but I got an error:

> tidyfast::dt_pivot_longer(one.iris, 1:4, names_to=c("part", ".value"), names_sep="[.]")
Error in melt.data.table(data = dt_, id.vars = id_vars, measure.vars = cols,  : 
  'variable.name' must be a character/integer vector of length=1.

these features should be possible for you to implement with current data.table, but for a more memory/time-efficient solution you may want to wait until my PR https://github.com/Rdatatable/data.table/pull/4731 is merged.
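
To make "possible with current data.table" concrete, here is a minimal sketch of one way the `names_sep` behaviour could be emulated today. It is an assumption about a workable approach, not how tidyfast or the linked PR implements it: melt to a single name column, split that column with `tstrsplit()`, and, for the `.value` case, `dcast()` the relevant piece back to wide. The intermediate column `name` and the objects `long` and `wide` are illustrative names only.

```r
library(data.table)

one.iris <- data.table(datasets::iris[1, ])

# Step 1: ordinary melt, which accepts only a single variable.name.
long <- melt(one.iris, measure.vars = 1:4,
             variable.name = "name", value.name = "value")

# Step 2: split the former column names on the separator,
# mimicking names_to = c("part", "dim"), names_sep = "[.]".
long[, c("part", "dim") := tstrsplit(name, "[.]")]
long[, name := NULL]

# Step 3: for names_to = c("part", ".value"), spread the "dim" piece
# back out into columns (here: Length and Width).
wide <- dcast(long, Species + part ~ dim, value.var = "value")
```

Note that `dcast()` sorts by the left-hand-side variables, so the row order of `wide` can differ from what tidyr returns. The full long table built in steps 1 and 2 is also extra work for the `.value` case, which is presumably the kind of overhead a more direct melt would avoid.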

TysonStanley commented 4 years ago

Hi @tdhock, as I mentioned in #40, I definitely want to integrate these features. The plan is to replicate tidyr as closely as possible, including the names arguments. But my current workload means I won't be able to get to this very soon.

And thank you for your PR to data.table. Those slight changes make a huge difference.

tdhock commented 4 years ago

ok that is useful to know, thanks. If you have time please code review that PR and leave suggestions / comments for improvement (you / tidyfast would probably be one of the first users of this new functionality).

TysonStanley commented 4 years ago

By the way, I LOVE that PR for data.table!! I don't have time right now to investigate it thoroughly, but the functionality (user-facing) is fantastic and would very easily translate to dt_pivot_longer(). I'm curious what Matt and the team will think but I personally love it.

tdhock commented 4 years ago

thanks! seems like Matt doesn't have much time to review/merge PRs these days but when he does, it would help if you left a comment there so they would know that there is at least one other person who likes it.