markfairbanks / tidytable

Tidy interface to 'data.table'
https://markfairbanks.github.io/tidytable/

`semi_join.(copy = T)` argument is missing #564

Closed exsell-jc closed 2 years ago

exsell-jc commented 2 years ago

When doing a semi join, the columns from y do not automatically get copied over. In dplyr, the default is copy = FALSE. When copy = TRUE is set, the column(s) from table y should be added to table x.

markfairbanks commented 2 years ago

Do you have an example? As far as I know, copy is only useful when dealing with remote tables.

markfairbanks commented 2 years ago

From your description I don't think you're looking for copy = TRUE. Even in dplyr that doesn't bring the columns from y over.

pacman::p_load(tidytable, dplyr)

x <- tidytable(x = c("a", "a", "b", "c"), y = c(1, 1, 2, 3))
y <- tidytable(x = c("a", "b"), z = 1:2)

x %>%
  dplyr::semi_join(y)
#> Joining, by = "x"
#> # A tidytable: 3 × 2
#>   x         y
#>   <chr> <dbl>
#> 1 a         1
#> 2 a         1
#> 3 b         2

x %>%
  dplyr::semi_join(y, copy = TRUE)
#> Joining, by = "x"
#> # A tidytable: 3 × 2
#>   x         y
#>   <chr> <dbl>
#> 1 a         1
#> 2 a         1
#> 3 b         2

Are you looking for `inner_join.()`? It keeps the matching rows and also brings the columns from y over:

pacman::p_load(tidytable, dplyr)

x <- tidytable(x = c("a", "a", "b", "c"), y = c(1, 1, 2, 3))
y <- tidytable(x = c("a", "b"), z = 1:2)

x %>%
  dplyr::inner_join(y)
#> Joining, by = "x"
#> # A tidytable: 3 × 3
#>   x         y     z
#>   <chr> <dbl> <int>
#> 1 a         1     1
#> 2 a         1     1
#> 3 b         2     2

x %>%
  inner_join.(y)
#> # A tidytable: 3 × 3
#>   x         y     z
#>   <chr> <dbl> <int>
#> 1 a         1     1
#> 2 a         1     1
#> 3 b         2     2
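
To make the distinction concrete, here is a minimal sketch using plain data frames and dplyr only. The names `left`, `right`, `semi`, `inner`, and `by_filter` are made up for illustration; nothing here is specific to tidytable. It treats a semi join as a filter on key membership, while an inner join is the one that adds columns from the second table.

```r
library(dplyr)

# Hypothetical example tables, mirroring the reprex above.
left  <- data.frame(x = c("a", "a", "b", "c"), y = c(1, 1, 2, 3))
right <- data.frame(x = c("a", "b"), z = 1:2)

# Filtering join: keeps rows of `left` that have a match in `right`,
# and returns only the columns of `left`.
semi <- semi_join(left, right, by = "x")

# Mutating join: keeps the matching rows and adds the columns of `right`.
inner <- inner_join(left, right, by = "x")

# The same rows as the semi join, selected by key membership in base R.
by_filter <- left[left$x %in% right$x, ]

names(semi)   # columns of `left` only
names(inner)  # columns of both tables

# For two local data frames, copy = TRUE should not change the result.
semi_copy <- semi_join(left, right, by = "x", copy = TRUE)
```

If the goal is to end up with `z` in the result, `inner_join()` (or `left_join()` to keep unmatched rows too) is the join to reach for, not `copy = TRUE`.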

I'm going to close this for now because I think copy = TRUE isn't what you're looking for, but we can keep discussing here to figure it out.