duttashi / learnr

Exploratory, Inferential and Predictive data analysis. Feel free to show your :heart: by giving a star :star:
MIT License

How to collapse rows with same identifier and retain non-empty column values? #47

Closed duttashi closed 5 years ago

duttashi commented 5 years ago

This question was originally asked on Stack Overflow.

Question

How can I collapse (merge) rows that share the same identifier while retaining the non-empty values (here, any nonzero value) in each column?

Data

df <- data.frame(produce = c("apples", "apples", "bananas", "bananas"),
                 grocery1 = c(0, 1, 1, 1),
                 grocery2 = c(1, 0, 1, 1),
                 grocery3 = c(0, 0, 1, 1))

Desired output

  produce grocery1 grocery2 grocery3
1  apples        1        1        0
2 bananas        1        1        1

duttashi commented 5 years ago

Solution

Using base R's aggregate(), we can group by produce and set each column to 1 if any value in the group is nonzero:

aggregate(.~produce, df, function(x) +any(x > 0))

#  produce grocery1 grocery2 grocery3
#1  apples        1        1        0
#2 bananas        1        1        1
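
The anonymous function does its work in three steps: `x > 0` gives a logical vector, `any()` reduces it to a single TRUE/FALSE per group, and the unary `+` coerces that logical to an integer. A minimal sketch of those steps, using the grocery1 values of the two "apples" rows from the data above (the variable name `flag` is only for illustration):

```r
# grocery1 values for the two "apples" rows in the example data
x <- c(0, 1)

x > 0          # element-wise comparison, yields a logical vector
any(x > 0)     # TRUE if at least one value in the group is nonzero
+any(x > 0)    # unary plus coerces the logical to an integer

# as.integer() is an equivalent, more explicit spelling
flag <- as.integer(any(x > 0))
```

If the terse `+` reads as too cryptic, `as.integer(any(x > 0))` can be dropped into any of the solutions in its place.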

Or, using dplyr:

library(dplyr)
df %>%
  group_by(produce) %>%
  summarise_all(~+any(. > 0))

#  produce grocery1 grocery2 grocery3
#  <fct>      <int>    <int>    <int>
#1 apples         1        1        0
#2 bananas        1        1        1
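
In dplyr 1.0.0 and later, `summarise_all()` is marked as superseded in favour of `across()`. A sketch of the equivalent call, assuming dplyr >= 1.0.0 is installed (the result name `res` is only for illustration):

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(produce = c("apples", "apples", "bananas", "bananas"),
                 grocery1 = c(0, 1, 1, 1),
                 grocery2 = c(1, 0, 1, 1),
                 grocery3 = c(0, 0, 1, 1))

# across(everything(), ...) applies the function to every non-grouping column
res <- df %>%
  group_by(produce) %>%
  summarise(across(everything(), ~ +any(.x > 0)))
```

The grouping column `produce` is left out of `everything()` automatically, so only the grocery columns are summarised.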

The same with data.table (note that setDT() converts df to a data.table by reference, so df itself is modified):

library(data.table)
setDT(df)[, lapply(.SD, function(x) +any(x > 0)), by=produce]
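
If the original data frame should stay untouched, one option is to work on a copy made with `as.data.table()` instead of converting in place. A minimal sketch under that assumption (the names `dt` and `res` are only for illustration):

```r
library(data.table)

# Same example data as in the question
df <- data.frame(produce = c("apples", "apples", "bananas", "bananas"),
                 grocery1 = c(0, 1, 1, 1),
                 grocery2 = c(1, 0, 1, 1),
                 grocery3 = c(0, 0, 1, 1))

# as.data.table() returns a copy, so df remains a plain data.frame
dt <- as.data.table(df)

# .SD holds all non-grouping columns for the current group
res <- dt[, lapply(.SD, function(x) +any(x > 0)), by = produce]
```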