Open asfimport opened 4 years ago
Neal Richardson / @nealrichardson: If you wanted to explore this, one challenge I see is that pivot_longer and pivot_wider aren't generics, so you can't just make arrow methods for them.
Dominic Dennenmoser:
Thanks for referring to that. I've just looked for issues or pull requests mentioning anything in that direction. Fortunately, a generic version of `pivot_[longer|wider]()` will be available in the upcoming version of `tidyr`, and is already implemented in the development version (#800).
Nigel McKernan:
The issue [~domiden] references was committed into `tidyr` 1.1.0 back in May of 2020, as you can see [here](https://github.com/tidyverse/tidyr/releases#:~:text=pivot_longer()%20and%20pivot_wider()%20are%20now%20generic%20so%20implementations%0Acan%20be%20provided%20for%20objects%20other%20than%20data%20frames), more than 2 years ago.
Would it be possible now to incorporate some `tidyr` methods that have been converted to generics into {arrow}?
EDIT: As well, the `nest()` generic is now [lazily-evaluated](https://github.com/tidyverse/tidyr/releases#:~:text=The%20nest()%20generic%20now%20avoids%20computing%20on%20.data%2C%20making%20it%20more%0Acompatible%20with%20lazy%20tibbles), making it easier to do remote operations, as of the `tidyr` 1.2.0 release earlier this year.
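As a rough illustration of what the generics make possible, a method could in principle be defined for Arrow tables. The sketch below is hypothetical and not part of {arrow}: the method `pivot_longer.Table` is an assumption made for illustration, and it simply collects the table into memory before delegating to `tidyr`, so it gains nothing in terms of lazy evaluation.

```r
library(arrow)
library(dplyr)
library(tidyr)

# Hypothetical S3 method, not provided by arrow: pull the Table into
# R memory, then delegate to tidyr's data frame method.
pivot_longer.Table <- function(data, ...) {
  tidyr::pivot_longer(dplyr::collect(data), ...)
}

# Made-up example data; the column names are illustrative only.
tbl <- arrow_table(data.frame(id = 1:2, x = c(1, 2), y = c(3, 4)))

# Dispatches on the "Table" class because pivot_longer() is generic.
long <- pivot_longer(tbl, cols = c(x, y))
```

A real implementation would presumably translate the reshaping into Arrow compute operations rather than collecting first; the sketch only shows that S3 dispatch is no longer the obstacle.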
Related to #34265
I think it would be reasonable to implement an interface to the `tidyr` package. The implementation would allow ArrowTables to be processed lazily before they are pulled back into memory. However, currently you need to collect the table first before applying `tidyr` methods. The following code chunk shows an example routine:
The main focus might be the following three methods:
- `tidyr::[un]nest()`,
- `tidyr::pivot_[longer|wider]()`, and
- `tidyr::separate()`.
I suppose the last two can be implemented fairly quickly, but `tidyr::nest()` and `tidyr::unnest()` cannot be implemented before conversion to List will be accessible.
Reporter: Dominic Dennenmoser
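The collect-first workflow described in the issue can be sketched roughly as follows. This is an illustrative reconstruction, not the original code chunk from the issue; the data and column names are made up.

```r
library(arrow)
library(dplyr)
library(tidyr)

# Made-up example data; the column names are illustrative only.
df <- data.frame(id = 1:3, x = c(1, 2, 3), y = c(4, 5, 6))
tbl <- arrow_table(df)

result <- tbl %>%
  filter(id > 1) %>%   # dplyr verb handled by arrow
  collect() %>%        # materialise the result as a tibble in R memory
  pivot_longer(cols = c(x, y), names_to = "name", values_to = "value")
```

Everything after `collect()` runs on an ordinary in-memory tibble, which is the step the issue proposes to avoid.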
Note: This issue was originally created as ARROW-8813. Please see the migration documentation for further details.