tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

Combined slice_head and slice_tail into one function #7078

Closed james-kilgour closed 2 months ago

james-kilgour commented 3 months ago

There's currently no quick way to extract both the first and last rows of each group in a grouped data frame. This is useful if, for example, you want to identify the limits of a range of values in an ordered sequence.

Currently, the workflow looks something like:

### Option one:

df_1 <- df %>%
        group_by(v1, v2) %>%
        arrange(v1, v2) %>%
        slice_head()

df_2 <- df %>%
        group_by(v1, v2) %>%
        arrange(v1, v2) %>%
        slice_tail()

data <- rbind(df_1, df_2)

### Option two:

df <- df %>%
        group_by(v1, v2) %>%
        arrange(v1, v2) %>%
        filter(row_number() == 1 | row_number() == n())
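
As a concrete illustration of the use case described above (finding the limits of an ordered sequence per group), here is a minimal sketch of the `filter()` approach on the built-in `iris` dataset. Using `iris` in place of `df`, `Species` as the grouping column, and `Sepal.Length` as the ordering column are assumptions made for illustration; the issue does not define `df`, `v1`, or `v2`.

```r
library(dplyr, warn.conflicts = FALSE)

# Assumption: iris stands in for `df`, and Species for the grouping columns.
limits <- iris %>%
  group_by(Species) %>%
  # Sort within each group so the first and last rows hold the extremes
  arrange(Sepal.Length, .by_group = TRUE) %>%
  # Keep the first and last row of each group
  filter(row_number() == 1 | row_number() == n()) %>%
  ungroup()

limits
```

Because `filter()` keeps each matching row at most once, a group containing a single row contributes that row once rather than twice.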

The following code combines existing dplyr functionality to make this operation more intuitive:

slice_ends <- function(data){

    data_1 <- data %>%
        slice_head()

    data_2 <- data %>%
        slice_tail()

    rbind(data_1, data_2)

}

# To give:

df <- df %>%
        group_by(v1, v2) %>%
        arrange(v1, v2) %>%
        slice_ends()

ks8997 commented 3 months ago

Not sure if this is necessary. I like:

iris %>% 
  group_by(Species) %>% 
  slice(1, n())

etiennebacher commented 2 months ago

Just as a complement to @ks8997's nice answer, here's how to customize the number of head and tail rows:

library(dplyr, warn.conflicts = FALSE)

iris %>% 
  # Get the first 2 rows and the last 3 rows by group
  slice(1:2, (n() - 2):n(), .by = Species)
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#> 1           5.1         3.5          1.4         0.2     setosa
#> 2           4.9         3.0          1.4         0.2     setosa
#> 3           4.6         3.2          1.4         0.2     setosa
#> 4           5.3         3.7          1.5         0.2     setosa
#> 5           5.0         3.3          1.4         0.2     setosa
#> 6           7.0         3.2          4.7         1.4 versicolor
#> 7           6.4         3.2          4.5         1.5 versicolor
#> 8           6.2         2.9          4.3         1.3 versicolor
#> 9           5.1         2.5          3.0         1.1 versicolor
#> 10          5.7         2.8          4.1         1.3 versicolor
#> 11          6.3         3.3          6.0         2.5  virginica
#> 12          5.8         2.7          5.1         1.9  virginica
#> 13          6.5         3.0          5.2         2.0  virginica
#> 14          6.2         3.4          5.4         2.3  virginica
#> 15          5.9         3.0          5.1         1.8  virginica
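
The `slice()` pattern above can also be wrapped in a small helper if a named function is preferred. The following is only a sketch: `slice_ends_n()` and its `n_head` / `n_tail` parameters are hypothetical names, not part of dplyr. It clamps both counts to the group size and drops repeated indices, so groups with fewer rows than requested are returned without duplicated rows.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper (not a dplyr function): keep the first `n_head` and
# the last `n_tail` rows of each group, never returning a row twice.
slice_ends_n <- function(data, n_head = 1, n_tail = 1) {
  data %>%
    slice(unique(c(
      # Positions of the first rows, clamped to the group size
      seq_len(min(n_head, n())),
      # Positions of the last rows, in ascending order
      rev(n() + 1L - seq_len(min(n_tail, n())))
    )))
}

# First 2 and last 3 rows per species
iris %>%
  group_by(Species) %>%
  slice_ends_n(n_head = 2, n_tail = 3)
```

One caveat of this sketch: if `data` has columns literally named `n_head` or `n_tail`, they would take precedence over the function arguments inside `slice()`.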