echasnovski / keyholder

Store Data About Rows
https://echasnovski.github.io/keyholder/

Automatically use keys in joins #1

Closed hadley closed 6 years ago

hadley commented 6 years ago

I.e., instead of defaulting to the intersection of all variables, it would make more sense for joins to default to the intersection of keys.

echasnovski commented 6 years ago

Thanks, that is indeed interesting. The main goal of this package is to invisibly track rows after a user-defined function is applied. For example:

library(dplyr)
library(keyholder)

modify <- function(.tbl) {
  .tbl %>%
    filter(vs == 1) %>%
    arrange(mpg)
}

mtcars %>%
  use_id() %>%
  modify() %>%
  pull_key(.id)
#>  [1] 11  6 10  4 32 21  3  9  8 26 19 28 18 20

Using the intersection of keys by default can break this (if a join is used without setting by). However, offering it as an option could be useful.
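
To make the trade-off concrete, here is a minimal sketch of what "intersection of keys" could mean in practice. The helper common_key_names() is hypothetical and not part of keyholder; it only relies on key_by(), use_id() and keys() from the package. It is an illustration of the idea under those assumptions, not a proposed implementation.

```r
library(dplyr)
library(keyholder)

# Hypothetical helper (not part of keyholder): names of the keys that
# both tables share, i.e. what the proposed default for `by` would be.
common_key_names <- function(x, y) {
  intersect(names(keys(x)), names(keys(y)))
}

# Two keyed tables whose key sets overlap only in `am`
x <- mtcars %>% key_by(vs, am)
y <- mtcars %>% key_by(am, gear)
shared <- common_key_names(x, y)

# With use_id() the only shared key is `.id`, which is stored as a key
# and is not a regular column of the table.
tracked <- mtcars %>% use_id()
id_only <- common_key_names(tracked, tracked)
```

In the second case the shared key is the row-tracking .id itself, so a join inside a user-defined function that silently defaulted to it would presumably behave differently from the same function run on an unkeyed table. That is one reading of why this seems safer as an opt-in than as the default.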