`mutate(row_number())` fails on a 0 row data frame when overwriting existing column #639

Closed jfdesomzee closed 1 year ago

jfdesomzee commented 1 year ago


I get an error when I create a variable with an existing name in a data table with 0 rows. Any idea how I could make this work?

iris %>%
  tidytable::as_tidytable() %>%
  tidytable::filter(FALSE) %>% 
#> # A tidytable: 0 x 6
#> # ... with 6 variables: Sepal.Length <dbl>, Sepal.Width <dbl>,
#> #   Petal.Length <dbl>, Petal.Width <dbl>, Species <fct>, Sepal.Length2 <list>

iris %>%
  tidytable::as_tidytable() %>%
  tidytable::filter(1:.N<5) %>% 
#> # A tidytable: 4 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <int>       <dbl>        <dbl>       <dbl> <fct>  
#> 1            1         3.5          1.4         0.2 setosa 
#> 2            2         3            1.4         0.2 setosa 
#> 3            3         3.2          1.3         0.2 setosa 
#> 4            4         3.1          1.5         0.2 setosa

iris %>%
  tidytable::as_tidytable() %>%
  tidytable::filter(FALSE) %>% 
#> Error:
#> ! Can't recycle input of size 2 to size 0.

markfairbanks commented 1 year ago

I don't know if this is a bug - this errors in dplyr as well:

library(dplyr, warn.conflicts = FALSE)

df <- tibble(x = integer(), y = character())

df %>%
  dplyr::mutate(x = 1:n())
#> Error in `dplyr::mutate()`:
#> ℹ In argument: `x = 1:n()`.
#> Caused by error:
#> ! `x` must be size 0 or 1, not 2.

markfairbanks commented 1 year ago

I'm going to close this - I don't think this is something that should be changed if it also fails in dplyr.

If you have any other questions around it let me know.

markfairbanks commented 1 year ago

I think the recommended way to deal with this would be to pass a vector of length 0 to mutate():

iris %>%
  as_tidytable() %>%
  filter(FALSE) %>%
#> # A tidytable: 0 × 5
#> # … with 5 variables: Sepal.Length <int>, Sepal.Width <dbl>,
#> #   Petal.Length <dbl>, Petal.Width <dbl>, Species <fct>
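A minimal sketch of this suggestion, assuming `integer()` as the zero-length vector and `Sepal.Length` as the column being overwritten (both chosen here for illustration). Any zero-length vector of the desired type should behave the same way, since there is nothing to recycle:

```r
library(tidytable)

# Build an empty tidytable that keeps the column structure of iris.
empty <- iris %>%
  as_tidytable() %>%
  filter(FALSE)

# integer() has length 0, which matches the 0 rows of `empty`,
# so overwriting the existing column needs no recycling.
out <- empty %>%
  mutate(Sepal.Length = integer())
```

The same idea applies to other types, for example `character()` or `double()`.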

markfairbanks commented 1 year ago

Also FYI - you don't need to use require(magrittr) if you're only using it for the %>% pipe. tidytable reexports %>%.

jfdesomzee commented 1 year ago

This does not fail in dplyr:

library(dplyr, warn.conflicts = FALSE)

df <- tibble(x = integer(), y = character())

df %>%
  dplyr::mutate(x = row_number())
#> # A tibble: 0 x 2
#> # ... with 2 variables: x <int>, y <chr>

markfairbanks commented 1 year ago

Ah gotcha - 1:n() fails because on an empty data frame it's essentially calling 1:0, which counts down and returns the length-2 vector c(1, 0). dplyr's row_number(), on the other hand, is doing something different in the background.

In tidytable, row_number() basically calls 1:n().
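The difference is easy to reproduce in base R. This is a minimal sketch, not the code of either package: `n` is a plain stand-in for the row count, and `seq_len()` is used here only as one common zero-safe alternative to the colon operator.

```r
n <- 0L  # stand-in for the row count of an empty data frame

# The colon operator counts down when the right-hand side is smaller,
# so 1:0 yields two elements instead of none.
colon_ids <- 1:n
length(colon_ids)

# seq_len() is defined for zero and returns an empty integer vector.
safe_ids <- seq_len(n)
length(safe_ids)
```

For any `n` of 1 or more, both expressions produce the same sequence, so the two only diverge in the empty case.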

markfairbanks commented 1 year ago

All set.


df <- tidytable(x = integer(), y = character())

df %>%
  mutate(x = row_number())
#> # A tidytable: 0 × 2
#> # … with 2 variables: x <int>, y <chr>