Closed agilebean closed 3 years ago
This is happening because the 01
, 02
, etc. cells are character and the 10
, 11
, etc. cells are numeric.
You've got various options:
col_type
for this column (character, in this case).One day, there might be a way for a user to express "I never want list-columns" (#51), but that is not this day.
Thanks a lot for the feedback @jennybc !
And special thanks for the expression "that is not this day" for diplomatic linguistic expression that it cannot be implemented yet.
However, I would be grateful if you could give feedback to these two approaches:
- Specify a
col_type
for this column (character, in this case).
Is there a way (maybe regex) to *only specify the first column (as I have 60+ columns which are all imported correctly)?
- Convert the numbers to character and simplify the column on the R side.
I do this now by importing the id column *separately. Can you give a dplyr expression that could convert the list column to a character vector? I failed with unlist
and map_chr
library(tidyverse)
(dat <- tibble(x = list("01", "02", 10, 11)))
#> # A tibble: 4 × 1
#> x
#> <list>
#> 1 <chr [1]>
#> 2 <chr [1]>
#> 3 <dbl [1]>
#> 4 <dbl [1]>
dat %>%
mutate(x = map_chr(x, as.character))
#> # A tibble: 4 × 1
#> x
#> <chr>
#> 1 01
#> 2 02
#> 3 10
#> 4 11
Created on 2021-08-30 by the reprex package (v2.0.1.9000)
Thanks so much @jennybc deeply grateful!!!
@jennybc: May it be possible to reconsider the default conversion behavior of read_sheet()?
Reason: I just used read_sheet() again for some research data and realized that in any case when read_sheet() creates a list column, a user of read_sheet() would still be able to work with its content more easily if it was converted to a char vector.
Please see a comparison of the post processing of read_sheet() when
A) read_sheet() has imported mixed columns (char/num and NAs) as list columns if they were numeric columns
mutate(across(
where(is.list),
# list conversion to char leaves "NULL" entries > TRICKY: NULLs convert to NA
~ as.character(.x) %>%
na_if("NULL") %>%
as.numeric))
B) read_sheet() has imported mixed columns (char/num and NAs) as char:
mutate(across(
num_var_labels,
~ as.numeric))
Solution B implies the conversion na_if("NULL").
I'm suggesting Solution B because it's intuitive for anybody to understand the conversion to numeric, but Solution A is almost impossible to come up with for tidyverse beginners. The option to specify the data type of columns is nice but impractible for datasets with 100+ columns which is often the case in social science datasets.
In my google sheet, one column has single quoted entries for numbers 01 to 09, and unquoted entries for entries above 10. This mixed datatype is converted as list entries, i.e. each entry is a list, which makes it difficult to convert back because the conversion depends on the datatype buried in the first element of a list, so I can't use
unnest()
The previous versiongoogle_sheets::gs_read()
imported the same column correctly as character column.Question: Is there any way to avoid the conversion into a list per entry?
Constraints:
col_types
because there are over 60 columns which are imported correctly, so only this ID column has the list entry problem.