ropensci / visdat

Preliminary Exploratory Visualisation of Data
https://docs.ropensci.org/visdat/

vis_dat() can't handle new tibble name repair conventions #113

Closed sharlagelfand closed 1 year ago

sharlagelfand commented 5 years ago

There are new name repair conventions as of tibble 2.0.0 that are exposed in other packages (e.g., readxl). They rename blank column names ("") to the form ..1, ..2, etc.: https://tibble.tidyverse.org/reference/name-repair.html

Unfortunately it looks like vis_dat() can't handle this kind of column name 😭

library(tibble)
library(visdat)

x <- tibble(`..1` = 1)

x
#> # A tibble: 1 x 1
#>     ..1
#>   <dbl>
#> 1     1

vis_dat(x)
#> Error in .f(.x[[i]], ...): ..1 used in an incorrect context, no ... to look in

Created on 2019-02-27 by the reprex package (v0.2.1.9000)

Session info

``` r
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value
#>  version  R version 3.5.1 (2018-07-02)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_Canada.1252
#>  ctype    English_Canada.1252
#>  tz       America/New_York
#>  date     2019-02-27
#>
#> - Packages --------------------------------------------------------------
#>  package     * version     date       lib source
#>  assertthat    0.2.0       2017-04-11 [1] CRAN (R 3.5.1)
#>  backports     1.1.3       2018-12-14 [1] CRAN (R 3.5.2)
#>  callr         2.0.4       2018-05-15 [1] CRAN (R 3.5.1)
#>  cli           1.0.1       2018-09-25 [1] CRAN (R 3.5.2)
#>  crayon        1.3.4       2017-09-16 [1] CRAN (R 3.5.1)
#>  desc          1.2.0       2018-05-01 [1] CRAN (R 3.5.1)
#>  devtools      2.0.1       2018-10-26 [1] CRAN (R 3.5.2)
#>  digest        0.6.18      2018-10-10 [1] CRAN (R 3.5.2)
#>  dplyr         0.8.0.9000  2019-02-19 [1] Github (tidyverse/dplyr@22b923e)
#>  evaluate      0.11        2018-07-17 [1] CRAN (R 3.5.1)
#>  fansi         0.4.0       2018-10-05 [1] CRAN (R 3.5.2)
#>  fs            1.2.6       2018-08-23 [1] CRAN (R 3.5.1)
#>  glue          1.3.0       2018-07-17 [1] CRAN (R 3.5.1)
#>  htmltools     0.3.6       2017-04-28 [1] CRAN (R 3.5.1)
#>  knitr         1.20        2018-02-20 [1] CRAN (R 3.5.1)
#>  magrittr      1.5         2014-11-22 [1] CRAN (R 3.5.1)
#>  memoise       1.1.0       2017-04-21 [1] CRAN (R 3.5.1)
#>  pillar        1.3.1       2018-12-15 [1] CRAN (R 3.5.2)
#>  pkgbuild      1.0.2       2018-10-16 [1] CRAN (R 3.5.2)
#>  pkgconfig     2.0.2       2018-08-16 [1] CRAN (R 3.5.2)
#>  pkgload       1.0.2       2018-10-29 [1] CRAN (R 3.5.2)
#>  prettyunits   1.0.2       2015-07-13 [1] CRAN (R 3.5.1)
#>  processx      3.1.0       2018-05-15 [1] CRAN (R 3.5.1)
#>  purrr         0.2.5       2018-05-29 [1] CRAN (R 3.5.1)
#>  R6            2.3.0       2018-10-04 [1] CRAN (R 3.5.2)
#>  Rcpp          1.0.0       2018-11-07 [1] CRAN (R 3.5.1)
#>  remotes       2.0.2       2018-10-30 [1] CRAN (R 3.5.2)
#>  rlang         0.3.1       2019-01-08 [1] CRAN (R 3.5.2)
#>  rmarkdown     1.11.7      2019-02-22 [1] Github (rstudio/rmarkdown@45146bf)
#>  rprojroot     1.3-2       2018-01-03 [1] CRAN (R 3.5.1)
#>  sessioninfo   1.1.1       2018-11-05 [1] CRAN (R 3.5.2)
#>  stringi       1.2.4       2018-07-20 [1] CRAN (R 3.5.2)
#>  stringr       1.3.1       2018-05-10 [1] CRAN (R 3.5.1)
#>  testthat      2.0.0       2017-12-13 [1] CRAN (R 3.5.1)
#>  tibble      * 2.0.99.9000 2019-02-19 [1] Github (tidyverse/tibble@5a6e727)
#>  tidyr         0.8.2       2018-10-28 [1] CRAN (R 3.5.2)
#>  tidyselect    0.2.5       2018-10-11 [1] CRAN (R 3.5.2)
#>  usethis       1.4.0       2018-08-14 [1] CRAN (R 3.5.1)
#>  utf8          1.1.4       2018-05-24 [1] CRAN (R 3.5.1)
#>  visdat      * 0.5.3.9000  2019-02-27 [1] Github (ropensci/visdat@0ec8392)
#>  withr         2.1.2       2018-03-15 [1] CRAN (R 3.5.1)
#>  yaml          2.2.0       2018-07-25 [1] CRAN (R 3.5.1)
#>
#> [1] C:/Users/shg/Documents/R/win-library/3.5
#> [2] C:/Program Files/R/R-3.5.1/library
```

I dug into vis_dat() a bit and the issue is within vis_gather_(), specifically with tidyr::gather_(). gather_() is deprecated so that would probably be why it doesn't support this new naming convention. Just wanted to flag it to you!

Edit: I actually don't really know if this is a "you" problem or a tidyr problem (e.g. can gather(), the "regular" function, handle .. names at all?)
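A likely reason a name such as ..1 trips things up is that R reserves symbols of the form ..1, ..2, and so on for the elements of `...`. The base R sketch below illustrates that behaviour only. It assumes that somewhere in the gather_() call chain a column name is turned into a symbol and evaluated, which this thread does not confirm; `first_arg` and `cols` are made-up names for illustration.

```r
# Inside a function, ..1 is R's shorthand for the first element of `...`
first_arg <- function(...) ..1
first_arg("a", "b")

# A one-element list standing in for a data frame with a column named "..1"
cols <- list("..1" = 1)

# Looking the column up by its string name works
by_string <- cols[["..1"]]

# Evaluating the name as a symbol does not: R looks for `...` rather than
# for a variable called "..1", finds none, and signals an error
by_symbol <- tryCatch(
  eval(as.symbol("..1"), cols, enclos = emptyenv()),
  error = function(e) e
)
inherits(by_symbol, "error")
```

If that is what happens, any code path that evaluates column names as symbols would fail on ..1 regardless of the data.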

sharlagelfand commented 5 years ago

FYI: https://github.com/tidyverse/tidyr/issues/559

njtierney commented 5 years ago

Thanks for posting this issue, Sharla!

Indeed, even after the new release of tibble this still breaks with names that have two dots, e.g., (..1) - demonstration building from your lovely reprex:

library(tidyr)
library(tibble)

x <- tibble(`..1` = 1, 
            `..2` = 2,
            x3 = 3)

# Gathering without specific reference works

x %>%
  gather(key = "col_name", 
         value = "value")
#> # A tibble: 3 x 2
#>   col_name value
#>   <chr>    <dbl>
#> 1 ..1          1
#> 2 ..2          2
#> 3 x3           3

library(visdat)

one_dot <- tibble(`.1` = 1)
two_dot <- tibble(`..1` = 1)
three_dot <- tibble(`...1` = 1)

vis_dat(one_dot)

vis_dat(three_dot)

vis_dat(two_dot)
#> Error in .f(.x[[i]], ...): ..1 used in an incorrect context, no ... to look in

Created on 2019-03-25 by the reprex package (v0.2.1)

I'll have a look at updating gather_ and have this fixed for the next release.

Thanks again for digging into this, I really appreciate it! :)

sharlagelfand commented 5 years ago

I think it may not work because .. names are not syntactic -- it seems they were a regrettable introduction by the tidyverse team that has now been fixed! There are lots of good tidbits in this chapter if you haven't seen it: https://principles.tidyverse.org/names-attribute.html (2.3 onward especially)

so i'm not sure if gather_ can even be fixed to allow them (but the introduction of them, which was happening via tibble, shouldn't happen anymore)
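For data that already carries ..1-style names (for example, from a file read with an older tibble), one possible workaround is to rename those columns to the three-dot form before plotting. The sketch below uses base R only; `fix_dot_dot_names` is a hypothetical helper, not part of visdat or tibble.

```r
# Hypothetical helper: rewrite names like "..1" to the three-dot form "...1".
# Names that do not match the two-dot pattern are left untouched.
fix_dot_dot_names <- function(df) {
  names(df) <- sub("^\\.\\.([0-9]+)$", "...\\1", names(df))
  df
}

# Build a small data frame, then assign the problematic names directly
df <- data.frame(a = 1, b = 2, x3 = 3)
names(df) <- c("..1", "..2", "x3")

fixed <- fix_dot_dot_names(df)
names(fixed)
#> [1] "...1" "...2" "x3"
```

The reprex earlier in this thread ran vis_dat() on a column named ...1 without an error, so passing `fixed` to vis_dat() should avoid the failure, although that has not been checked here.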

njtierney commented 1 year ago

I think this is now resolved, as you said, @sharlagelfand - tibble no longer allows the .. names (it errors when you try to create them), so I think it's all good from here :)

library(visdat)
library(tidyverse)

one_dot <- tibble(`.1` = 1)
two_dot <- tibble(`..1` = 1)
#> Error:
#> ! Column 1 must not have names of the form ... or ..j.
#> Use .name_repair to specify repair.
#> Caused by error in `repaired_names()`:
#> ! Names can't be of the form `...` or `..j`.
#> ✖ These names are invalid:
#>   * "..1" at location 1.

#> Backtrace:
#>      ▆
#>   1. └─tibble::tibble(..1 = 1)
#>   2.   └─tibble:::tibble_quos(xs, .rows, .name_repair)
#>   3.     └─tibble:::set_repaired_names(output, repair_hint = TRUE, .name_repair = .name_repair)
#>   4.       ├─rlang::set_names(...)
#>   5.       └─tibble:::repaired_names(names2(x), repair_hint, .name_repair = .name_repair, quiet = quiet)
#>   6.         ├─tibble:::subclass_name_repair_errors(...)
#>   7.         │ └─base::withCallingHandlers(...)
#>   8.         └─vctrs::vec_as_names(...)
#>   9.           └─vctrs (local) `<fn>`()
#>  10.             └─vctrs:::validate_unique(names = names, arg = arg, call = call)
#>  11.               └─vctrs:::stop_names_cannot_be_dot_dot(names, call = call)
#>  12.                 └─vctrs:::stop_names(...)
#>  13.                   └─vctrs:::stop_vctrs(...)
#>  14.                     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))
three_dot <- tibble(`...1` = 1)

vis_dat(one_dot)

vis_dat(two_dot)
#> Error in "lapply(text, glue_cmd, .envir = .envir)": ! Could not evaluate cli `{}` expression: `glue::glue_collapse…`.
#> Caused by error in `glue::glue_collapse(class(x), sep = ", ", last = ", and ")`:
#> ! object 'two_dot' not found
vis_dat(three_dot)

Created on 2022-10-21 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Monterey 12.3.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Perth
#>  date     2022-10-21
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package       * version    date (UTC) lib source
#>  assertthat      0.2.1      2019-03-21 [1] CRAN (R 4.2.0)
#>  backports       1.4.1      2021-12-13 [1] CRAN (R 4.2.0)
#>  broom           1.0.1      2022-08-29 [1] CRAN (R 4.2.0)
#>  cellranger      1.1.0      2016-07-27 [1] CRAN (R 4.2.0)
#>  cli             3.4.1      2022-09-23 [1] CRAN (R 4.2.0)
#>  colorspace      2.0-3      2022-02-21 [1] CRAN (R 4.2.0)
#>  crayon          1.5.2      2022-09-29 [1] CRAN (R 4.2.0)
#>  curl            4.3.3      2022-10-06 [1] CRAN (R 4.2.0)
#>  DBI             1.1.3      2022-06-18 [1] CRAN (R 4.2.0)
#>  dbplyr          2.2.1      2022-06-27 [1] CRAN (R 4.2.0)
#>  digest          0.6.30     2022-10-18 [1] CRAN (R 4.2.0)
#>  dplyr         * 1.0.10     2022-09-01 [1] CRAN (R 4.2.0)
#>  ellipsis        0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate        0.17       2022-10-07 [1] CRAN (R 4.2.0)
#>  fansi           1.0.3      2022-03-24 [1] CRAN (R 4.2.0)
#>  farver          2.1.1      2022-07-06 [1] CRAN (R 4.2.0)
#>  fastmap         1.1.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  forcats       * 0.5.2      2022-08-19 [1] CRAN (R 4.2.0)
#>  fs              1.5.2      2021-12-08 [1] CRAN (R 4.2.0)
#>  gargle          1.2.1      2022-09-08 [1] CRAN (R 4.2.0)
#>  generics        0.1.3      2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2       * 3.3.6      2022-05-03 [1] CRAN (R 4.2.0)
#>  glue            1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  googledrive     2.0.0      2021-07-08 [1] CRAN (R 4.2.0)
#>  googlesheets4   1.0.1      2022-08-13 [1] CRAN (R 4.2.0)
#>  gtable          0.3.1      2022-09-01 [1] CRAN (R 4.2.0)
#>  haven           2.5.1      2022-08-22 [1] CRAN (R 4.2.0)
#>  highr           0.9        2021-04-16 [1] CRAN (R 4.2.0)
#>  hms             1.1.2      2022-08-19 [1] CRAN (R 4.2.0)
#>  htmltools       0.5.3      2022-07-18 [1] CRAN (R 4.2.0)
#>  httr            1.4.4      2022-08-17 [1] CRAN (R 4.2.0)
#>  jsonlite        1.8.2      2022-10-02 [1] CRAN (R 4.2.0)
#>  knitr           1.40       2022-08-24 [1] CRAN (R 4.2.0)
#>  labeling        0.4.2      2020-10-20 [1] CRAN (R 4.2.0)
#>  lifecycle       1.0.3      2022-10-07 [1] CRAN (R 4.2.0)
#>  lubridate       1.8.0      2021-10-07 [1] CRAN (R 4.2.0)
#>  magrittr        2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  mime            0.12       2021-09-28 [1] CRAN (R 4.2.0)
#>  modelr          0.1.9      2022-08-19 [1] CRAN (R 4.2.0)
#>  munsell         0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  pillar          1.8.1      2022-08-19 [1] CRAN (R 4.2.0)
#>  pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr         * 0.3.5      2022-10-06 [1] CRAN (R 4.2.0)
#>  R.cache         0.16.0     2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3     1.8.2      2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo            1.25.0     2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils         2.12.0     2022-06-28 [1] CRAN (R 4.2.0)
#>  R6              2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  readr         * 2.1.3      2022-10-01 [1] CRAN (R 4.2.0)
#>  readxl          1.4.1      2022-08-17 [1] CRAN (R 4.2.0)
#>  reprex          2.0.2      2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang           1.0.6      2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown       2.17       2022-10-07 [1] CRAN (R 4.2.0)
#>  rstudioapi      0.14       2022-08-22 [1] CRAN (R 4.2.0)
#>  rvest           1.0.3      2022-08-19 [1] CRAN (R 4.2.0)
#>  scales          1.2.1      2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo     1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi         1.7.8      2022-07-11 [1] CRAN (R 4.2.0)
#>  stringr       * 1.4.1      2022-08-20 [1] CRAN (R 4.2.0)
#>  styler          1.7.0      2022-03-13 [1] CRAN (R 4.2.0)
#>  tibble        * 3.1.8      2022-07-22 [1] CRAN (R 4.2.0)
#>  tidyr         * 1.2.1      2022-09-08 [1] CRAN (R 4.2.0)
#>  tidyselect      1.2.0      2022-10-10 [1] CRAN (R 4.2.0)
#>  tidyverse     * 1.3.2      2022-07-18 [1] CRAN (R 4.2.0)
#>  tzdb            0.3.0      2022-03-28 [1] CRAN (R 4.2.0)
#>  utf8            1.2.2      2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs           0.4.2      2022-09-29 [1] CRAN (R 4.2.0)
#>  visdat        * 0.6.0.9000 2022-10-21 [1] local
#>  withr           2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun            0.33       2022-09-12 [1] CRAN (R 4.2.0)
#>  xml2            1.3.3      2021-11-30 [1] CRAN (R 4.2.0)
#>  yaml            2.3.5      2022-02-21 [1] CRAN (R 4.2.0)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```