tidyverse / funs

Collection of low-level functions for working with vctrs

all_same() - check all elements in vector are the same #19

Closed 1danjordan closed 4 years ago

1danjordan commented 7 years ago

Ran into this a few times in the last week and thought it might belong here:

all_same(1:10)         # FALSE
all_same(rep(1,10))    # TRUE

There is some old discussion of this on the mailing list. I don't know how efficient the suggested all(x == x[1]) is, or whether it is suitable across all modes, but it has been very handy in dplyr chains when trying to work out whether all the elements of a variable in a grouped data frame are the same:

all_same <- function(x) all(x == x[1])

mtcars %>% 
  group_by(carb) %>%
  summarise(all_same(am))

# A tibble: 6 x 2
#    carb `all_same(am)`
#   <dbl>          <lgl>
# 1     1          FALSE
# 2     2          FALSE
# 3     3           TRUE
# 4     4          FALSE
# 5     6           TRUE
# 6     8           TRUE

lionel- commented 7 years ago

This probably belongs in rlang (but would be reexported by vctrs).

1danjordan commented 7 years ago

Out of curiosity, does the simple x == x[1] predicate suffice, or would it require an S3 generic, for example to allow for tolerance in numerics?
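For illustration, the S3 route asked about above could look roughly like the sketch below. This is only an assumption about one possible design, not what rlang or vctrs implement: the `tol` argument and its default of `sqrt(.Machine$double.eps)` are hypothetical choices made for this example.

```r
# Hypothetical S3 generic: exact comparison by default,
# tolerance-based comparison for numeric vectors.
all_same <- function(x, ...) UseMethod("all_same")

# Default method: the simple predicate from the original post.
all_same.default <- function(x, ...) all(x == x[1])

# Numeric method (covers integer and double via implicit class dispatch).
# `tol` is an assumed parameter, not part of any existing API.
all_same.numeric <- function(x, tol = sqrt(.Machine$double.eps), ...) {
  all(abs(x - x[1]) < tol)
}

all_same(c("a", "a"))        # TRUE, via the default method
all_same(c(2, sqrt(2)^2))    # TRUE, via the numeric method
all_same(1:10)               # FALSE
```

Whether a tolerance is desirable at all is a separate design question; the sketch only shows that a generic makes it possible per type.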

lionel- commented 7 years ago

It makes sense to use R's default tolerance checking (which is the one from the C library).

hadley commented 5 years ago

I think the big question here is: what is the appropriate definition of equality?

Should all_same(c(NA, NA)) be TRUE?

What about all_same(c(2, sqrt(2) ^ 2))? If you think that should be true, do you worry that all_same(x[1:2]) and all_same(x[2:3]) might be true, but not all_same(x)?
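Both questions can be checked in base R. The sketch below uses the naive definition from the original post plus a hypothetical tolerance-based variant (`all_same_tol` and its `tol` value are assumptions made for this example) to show the non-transitivity concern.

```r
# Naive definition from the original post.
all_same <- function(x) all(x == x[1])

all_same(c(NA, NA))          # NA: neither TRUE nor FALSE
all_same(c(2, sqrt(2)^2))    # FALSE: the two doubles are not exactly equal

# Hypothetical tolerance-based variant, comparing against the first element.
all_same_tol <- function(x, tol = 1e-8) all(abs(x - x[1]) < tol)

# Adjacent values are within tolerance, but the first and last are not.
x <- c(0, 0.6e-8, 1.2e-8)
all_same_tol(x[1:2])   # TRUE
all_same_tol(x[2:3])   # TRUE
all_same_tol(x)        # FALSE
```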

hadley commented 5 years ago

Related to #22

hadley commented 5 years ago

Also if you want to use FP tolerance, I think you might have all_same(x) == TRUE, but vec_unique_count(x) != 1 because there's no way to incorporate FP tolerance into a hash.
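A small base R sketch of that mismatch, using `all.equal()` as a stand-in for a tolerance-based all_same() and `length(unique(x))` as a stand-in for vec_unique_count() (both substitutions are assumptions made for illustration):

```r
x <- c(2, sqrt(2)^2)

# Tolerance-based comparison treats the two values as equal.
same_within_tol <- isTRUE(all.equal(x[1], x[2]))
same_within_tol   # TRUE

# Exact, hash-based uniqueness still sees two distinct values.
n_unique <- length(unique(x))
n_unique          # 2
```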

hadley commented 5 years ago

OTOH we might be able to use the bit twiddling trick that base radix sort uses: https://github.com/wch/r-source/blob/trunk/src/main/radixsort.c#L620-L649 (which comes from data.table)

hadley commented 4 years ago

Now at vctrs::vec_duplicate_all()