tidyverse / tibble

A modern re-imagining of the data frame
https://tibble.tidyverse.org/
Other
663 stars 131 forks source link

add_column where .data is a tibble with 0 columns produces an unexpected, undocumented result #319

Closed dan87134 closed 6 years ago

dan87134 commented 6 years ago

I've posted this to the https://community.rstudio.com/c/tidyverse discussion in answer to a question I had asked about why add_column could not add a column to a tibble that had 0 columns (tibble())

I've come to the conclusion that add_column is producing an undocumented and unexpected result when .data (the input tibble) has 0 columns.

Here is why I think this...

I checked the source for add_column and it checks for a number of edge cases, for example where the number of added columns is 0, but the case where .data (i.e. the input tibble) has 0 columns is not checked.

This produces a the confusing error "Error: 0 is not TRUE", which I would call a programming surprise because does not look to be intentional :-) .

This happens because later in the code pluralise_n is used with nrow(.data), i.e. pluralise_n is used with 0,.to make an error message. pluralise_n produces the low level message ""Error: 0 is not TRUE" when it is asked to pluralize 0.

However, after a bit of checking and setup, all that add_column does is concatenate the added columns with the ones already in the tibble. This make sense because a tibble stores it's content as a sequence of columns.

So... it would make sense (IMHO of course) for add_column to check to see if the .data (the input tibble) contained 0 columns and if it did just return the input columns (i.e ...) as a new tibble. I think that this kind of behavior would be more consistent with name and implied behavior of a function named add_column.

Supporting to a 0 column .data could be added with a simple test at the beginning of the code.

existing code

if (ncol(df) == 0L) { return(.data) }

add this to support 0 columns .data

if(ncol(.data)==0L) { return(df) }

krlmlr commented 6 years ago

Thanks. I've only skimmed over the original thread, and I think that add_column() should never change the number of rows. The error message was bad, I fixed that.

I would consider a PR that adds a function that constructs a zero-column tibble with a fixed number of rows.

Note that the new tidyeval framework permits much more elegant solutions:

library(tibble)
x <- list(a = 1:3, b = 4:6)
tibble(!!! x)
#> # A tibble: 3 x 2
#>       a     b
#>   <int> <int>
#> 1     1     4
#> 2     2     5
#> 3     3     6
dan87134 commented 6 years ago

Thanks for the pointer to triple wack. It cleans up things for me... it's less typing than do.call :-) I really have to dig into tidyeval now.

github-actions[bot] commented 3 years ago

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.