tidymodels / dials

Tools for creating tuning parameter values
https://dials.tidymodels.org/

Add vctrs and dplyr support for `parameters` #114

Closed DavisVaughan closed 4 years ago

DavisVaughan commented 4 years ago

This PR adds vctrs and dplyr 1.0.0 support for the dials parameters subclass. I will do the param-grid one in a separate PR.

This currently relies on dev vctrs because of a bug in vec_rbind() that had to be fixed.


It works very similarly to https://github.com/tidymodels/rsample/pull/150 and revolves around the following four helpers (which also exist in rsample):

The first three helpers seem to be the common pieces when creating a tibble subclass (the 4th is just a combination of the other 3). The dplyr and vctrs methods use them to implement casting, common types, restoration, etc.


`x` is reconstructable to a parameters object, `to`, if the following invariants hold when comparing the two objects:

# Invariants:
# - Column order doesn't matter
# - Column names must be exactly the same after sorting them
# - Row order doesn't matter
# - Row presence / absence doesn't matter
#   - Caveat that the `$id` column cannot be duplicated
#   - Caveat that no rows can have `NA` values
# - Column types must be the same
#   - And `$object` must be a list of `param`s

These rules are somewhat different from rsample.
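As a rough illustration of the invariants listed above, here is a minimal base R sketch. It is not the dials implementation: `params_reconstructable_sketch()` and `make_param()` are hypothetical names, plain data frames stand in for tibbles, and comparing `class(col)[1]` is only an approximation of a real column type check.

```r
# Hypothetical stand-in for a dials `param` object (illustration only)
make_param <- function(label) structure(list(label = label), class = "param")

# Hypothetical check mirroring the invariants; not the dials internals
params_reconstructable_sketch <- function(x, to) {
  x_names <- sort(names(x))
  to_names <- sort(names(to))

  # Column order doesn't matter, but the sorted names must match exactly
  if (!identical(x_names, to_names)) return(FALSE)

  # Column types must be the same (rough proxy: first class of each column)
  first_class <- function(col) class(col)[1]
  x_types <- vapply(x[x_names], first_class, character(1))
  to_types <- vapply(to[to_names], first_class, character(1))
  if (!identical(x_types, to_types)) return(FALSE)

  # `$object` must be a list of `param`s
  if (!all(vapply(x$object, inherits, logical(1), what = "param"))) return(FALSE)

  # `$id` cannot be duplicated
  if (anyDuplicated(x$id) > 0) return(FALSE)

  # No `NA` values in the atomic columns
  has_na <- vapply(x, function(col) is.atomic(col) && anyNA(col), logical(1))
  if (any(has_na)) return(FALSE)

  TRUE
}

to <- data.frame(
  name = c("trees", "min_n", "sample_size"),
  id = c("trees", "min_n", "sample_size"),
  stringsAsFactors = FALSE
)
to$object <- list(make_param("trees"), make_param("min_n"), make_param("sample_size"))

# Changing the values in `id` keeps every invariant intact
renamed <- to
renamed$id <- paste0(renamed$id, "_better")

# Adding a column changes the set of column names
extra <- to
extra$hello <- 1

params_reconstructable_sketch(to[2:1, ], to)     # rows reordered and dropped
params_reconstructable_sketch(renamed, to)       # values changed, types kept
params_reconstructable_sketch(to[c(1, 1), ], to) # duplicated `id`
params_reconstructable_sketch(extra, to)         # extra column
```

Under these assumptions the first two calls pass the check and the last two fail it. Every rule is evaluated on `x` alone or on column names and types, never on the values `to` held before the change.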


The setup is generally the same as with rsample, except that rsets were special because they had a tibble ptype (taking a 0-row slice fell back to a tibble). That isn't the case here: taking a 0-row slice of a parameters tibble still gives a parameters tibble. This is more in line with how vctrs extensions usually work, and this time we actually have vec_ptype2() methods that get used.
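A quick way to look at the 0-row behaviour described above is sketched below. It assumes a dials version that includes this PR, and it uses `trees_param` purely as an example object. No output is shown because it may differ between versions.

```r
library(dials)

trees_param <- parameters(trees(), min_n(), sample_size())

# A 0-row slice; per the description above it should keep the `parameters` class
empty_slice <- trees_param[0, ]
inherits(empty_slice, "parameters")

# vctrs derives the prototype from a 0-row slice, so it should keep the class too
ptype <- vctrs::vec_ptype(trees_param)
inherits(ptype, "parameters")

# The common type of two parameters objects goes through the vec_ptype2() methods
common <- vctrs::vec_ptype2(trees_param, trees_param)
class(common)
```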


This PR also removes the info attribute from param-grids.

juliasilge commented 4 years ago

I spent some time with these changes and feel good about almost all of it.

Everything that I tried works as I believe we expect except for changing column names using mutate(), related to "Column names must be exactly the same after sorting them" above. I see the same behavior with both old and new dplyr, which is... good? I think? Except that my feeling is it should fall back to a tibble if the columns are changed in this way.

Old dplyr

library(tidymodels)
#> ── Attaching packages ──────────────────────────────────────────────────────────────────────────── tidymodels 0.1.0 ──
#> ✓ broom     0.5.6          ✓ recipes   0.1.12    
#> ✓ dials     0.0.6          ✓ rsample   0.0.6     
#> ✓ dplyr     0.8.5          ✓ tibble    3.0.1     
#> ✓ ggplot2   3.3.0          ✓ tune      0.1.0     
#> ✓ infer     0.5.1          ✓ workflows 0.1.1     
#> ✓ parsnip   0.1.1.9000     ✓ yardstick 0.0.6     
#> ✓ purrr     0.3.4
#> ── Conflicts ─────────────────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x recipes::step()   masks stats::step()

trees_param <- parameters(trees(), min_n(), sample_size())

basic_af <- tibble(id = rep("trees", 3), hello = 1:3)

## look good
trees_param %>% mutate(hello = 1)
#> # A tibble: 3 x 7
#>   name        id          source component component_id object      hello
#>   <chr>       <chr>       <chr>  <chr>     <chr>        <list>      <dbl>
#> 1 trees       trees       list   unknown   unknown      <nparam[+]>     1
#> 2 min_n       min_n       list   unknown   unknown      <nparam[+]>     1
#> 3 sample_size sample_size list   unknown   unknown      <nparam[?]>     1
trees_param %>% arrange(id)
#> Collection of 3 parameters for tuning
#> 
#>           id parameter type object class
#>        min_n          min_n    nparam[+]
#>  sample_size    sample_size    nparam[?]
#>        trees          trees    nparam[+]
#> 
#> Parameters needing finalization:
#>    # Observations Sampled ('sample_size')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.
trees_param %>% select(id)
#> # A tibble: 3 x 1
#>   id         
#>   <chr>      
#> 1 trees      
#> 2 min_n      
#> 3 sample_size
anti_join(trees_param, basic_af, by = "id")
#> Collection of 2 parameters for tuning
#> 
#>           id parameter type object class
#>        min_n          min_n    nparam[+]
#>  sample_size    sample_size    nparam[?]
#> 
#> Parameters needing finalization:
#>    # Observations Sampled ('sample_size')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.
right_join(trees_param, basic_af, by = "id")
#> # A tibble: 3 x 7
#>   name  id    source component component_id object      hello
#>   <chr> <chr> <chr>  <chr>     <chr>        <list>      <int>
#> 1 trees trees list   unknown   unknown      <nparam[+]>     1
#> 2 trees trees list   unknown   unknown      <nparam[+]>     2
#> 3 trees trees list   unknown   unknown      <nparam[+]>     3
full_join(trees_param, basic_af, by = "id")
#> # A tibble: 5 x 7
#>   name        id          source component component_id object      hello
#>   <chr>       <chr>       <chr>  <chr>     <chr>        <list>      <int>
#> 1 trees       trees       list   unknown   unknown      <nparam[+]>     1
#> 2 trees       trees       list   unknown   unknown      <nparam[+]>     2
#> 3 trees       trees       list   unknown   unknown      <nparam[+]>     3
#> 4 min_n       min_n       list   unknown   unknown      <nparam[+]>    NA
#> 5 sample_size sample_size list   unknown   unknown      <nparam[?]>    NA
left_join(trees_param, basic_af, by = "id")
#> # A tibble: 5 x 7
#>   name        id          source component component_id object      hello
#>   <chr>       <chr>       <chr>  <chr>     <chr>        <list>      <int>
#> 1 trees       trees       list   unknown   unknown      <nparam[+]>     1
#> 2 trees       trees       list   unknown   unknown      <nparam[+]>     2
#> 3 trees       trees       list   unknown   unknown      <nparam[+]>     3
#> 4 min_n       min_n       list   unknown   unknown      <nparam[+]>    NA
#> 5 sample_size sample_size list   unknown   unknown      <nparam[?]>    NA

## this one surprises and concerns me: STILL PARAMETERS
trees_param %>% mutate(id = paste0(id, "_better"),
                       name = paste0(name, "_best"))
#> Collection of 3 parameters for tuning
#> 
#>                  id   parameter type object class
#>        trees_better       trees_best    nparam[+]
#>        min_n_better       min_n_best    nparam[+]
#>  sample_size_better sample_size_best    nparam[?]
#> 
#> Parameters needing finalization:
#>    # Observations Sampled ('sample_size_better')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.

Created on 2020-05-20 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Mojave 10.14.6
#>  system   x86_64, darwin15.6.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Denver
#>  date     2020-05-20
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package       * version     date       lib source
#>  assertthat      0.2.1       2019-03-21 [1] CRAN (R 3.6.0)
#>  backports       1.1.7       2020-05-13 [1] CRAN (R 3.6.2)
#>  base64enc       0.1-3       2015-07-28 [1] CRAN (R 3.6.0)
#>  bayesplot       1.7.1       2019-12-01 [1] CRAN (R 3.6.0)
#>  boot            1.3-25      2020-04-26 [1] CRAN (R 3.6.2)
#>  broom         * 0.5.6       2020-04-20 [1] CRAN (R 3.6.2)
#>  callr           3.4.3       2020-03-28 [1] CRAN (R 3.6.2)
#>  class           7.3-17      2020-04-26 [1] CRAN (R 3.6.2)
#>  cli             2.0.2       2020-02-28 [1] CRAN (R 3.6.0)
#>  codetools       0.2-16      2018-12-24 [1] CRAN (R 3.6.2)
#>  colorspace      1.4-1       2019-03-18 [1] CRAN (R 3.6.0)
#>  colourpicker    1.0         2017-09-27 [1] CRAN (R 3.6.0)
#>  crayon          1.3.4       2017-09-16 [1] CRAN (R 3.6.0)
#>  crosstalk       1.1.0.1     2020-03-13 [1] CRAN (R 3.6.2)
#>  desc            1.2.0       2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools        2.3.0       2020-04-10 [1] CRAN (R 3.6.2)
#>  dials         * 0.0.6       2020-05-20 [1] Github (tidymodels/dials@b876ade)
#>  DiceDesign      1.8-1       2019-07-31 [1] CRAN (R 3.6.0)
#>  digest          0.6.25      2020-02-23 [1] CRAN (R 3.6.0)
#>  dplyr         * 0.8.5       2020-03-07 [1] CRAN (R 3.6.0)
#>  DT              0.13        2020-03-23 [1] CRAN (R 3.6.2)
#>  dygraphs        1.1.1.6     2018-07-11 [1] CRAN (R 3.6.0)
#>  ellipsis        0.3.1       2020-05-15 [1] CRAN (R 3.6.2)
#>  evaluate        0.14        2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi           0.4.1       2020-01-08 [1] CRAN (R 3.6.0)
#>  fastmap         1.0.1       2019-10-08 [1] CRAN (R 3.6.0)
#>  foreach         1.5.0       2020-03-30 [1] CRAN (R 3.6.2)
#>  fs              1.4.1       2020-04-04 [1] CRAN (R 3.6.2)
#>  furrr           0.1.0       2018-05-16 [1] CRAN (R 3.6.0)
#>  future          1.17.0      2020-04-18 [1] CRAN (R 3.6.2)
#>  generics        0.0.2       2018-11-29 [1] CRAN (R 3.6.0)
#>  ggplot2       * 3.3.0       2020-03-05 [1] CRAN (R 3.6.0)
#>  ggridges        0.5.2       2020-01-12 [1] CRAN (R 3.6.0)
#>  globals         0.12.5      2019-12-07 [1] CRAN (R 3.6.0)
#>  glue            1.4.1       2020-05-13 [1] CRAN (R 3.6.2)
#>  gower           0.2.1       2019-05-14 [1] CRAN (R 3.6.0)
#>  GPfit           1.0-8       2019-02-08 [1] CRAN (R 3.6.0)
#>  gridExtra       2.3         2017-09-09 [1] CRAN (R 3.6.0)
#>  gtable          0.3.0       2019-03-25 [1] CRAN (R 3.6.0)
#>  gtools          3.8.2       2020-03-31 [1] CRAN (R 3.6.2)
#>  highr           0.8         2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools       0.4.0       2019-10-04 [1] CRAN (R 3.6.0)
#>  htmlwidgets     1.5.1       2019-10-08 [1] CRAN (R 3.6.0)
#>  httpuv          1.5.2       2019-09-11 [1] CRAN (R 3.6.0)
#>  igraph          1.2.5       2020-03-19 [1] CRAN (R 3.6.0)
#>  infer         * 0.5.1       2019-11-19 [1] CRAN (R 3.6.0)
#>  inline          0.3.15      2018-05-18 [1] CRAN (R 3.6.0)
#>  ipred           0.9-9       2019-04-28 [1] CRAN (R 3.6.0)
#>  iterators       1.0.12      2019-07-26 [1] CRAN (R 3.6.0)
#>  janeaustenr     0.1.5       2017-06-10 [1] CRAN (R 3.6.0)
#>  knitr           1.28        2020-02-06 [1] CRAN (R 3.6.0)
#>  later           1.0.0       2019-10-04 [1] CRAN (R 3.6.0)
#>  lattice         0.20-41     2020-04-02 [1] CRAN (R 3.6.2)
#>  lava            1.6.7       2020-03-05 [1] CRAN (R 3.6.2)
#>  lhs             1.0.2       2020-04-13 [1] CRAN (R 3.6.2)
#>  lifecycle       0.2.0       2020-03-06 [1] CRAN (R 3.6.0)
#>  listenv         0.8.0       2019-12-05 [1] CRAN (R 3.6.0)
#>  lme4            1.1-23      2020-04-07 [1] CRAN (R 3.6.2)
#>  loo             2.2.0       2019-12-19 [1] CRAN (R 3.6.0)
#>  lubridate       1.7.8       2020-04-06 [1] CRAN (R 3.6.2)
#>  magrittr        1.5         2014-11-22 [1] CRAN (R 3.6.0)
#>  markdown        1.1         2019-08-07 [1] CRAN (R 3.6.0)
#>  MASS            7.3-51.6    2020-04-26 [1] CRAN (R 3.6.2)
#>  Matrix          1.2-18      2019-11-27 [1] CRAN (R 3.6.0)
#>  matrixStats     0.56.0      2020-03-13 [1] CRAN (R 3.6.0)
#>  memoise         1.1.0       2017-04-21 [1] CRAN (R 3.6.0)
#>  mime            0.9         2020-02-04 [1] CRAN (R 3.6.2)
#>  miniUI          0.1.1.1     2018-05-18 [1] CRAN (R 3.6.0)
#>  minqa           1.2.4       2014-10-09 [1] CRAN (R 3.6.0)
#>  munsell         0.5.0       2018-06-12 [1] CRAN (R 3.6.0)
#>  nlme            3.1-147     2020-04-13 [1] CRAN (R 3.6.2)
#>  nloptr          1.2.2.1     2020-03-11 [1] CRAN (R 3.6.0)
#>  nnet            7.3-14      2020-04-26 [1] CRAN (R 3.6.2)
#>  parsnip       * 0.1.1.9000  2020-05-19 [1] local
#>  pillar          1.4.4       2020-05-05 [1] CRAN (R 3.6.2)
#>  pkgbuild        1.0.8       2020-05-07 [1] CRAN (R 3.6.2)
#>  pkgconfig       2.0.3       2019-09-22 [1] CRAN (R 3.6.0)
#>  pkgload         1.0.2       2018-10-29 [1] CRAN (R 3.6.0)
#>  plyr            1.8.6       2020-03-03 [1] CRAN (R 3.6.2)
#>  prettyunits     1.1.1       2020-01-24 [1] CRAN (R 3.6.0)
#>  pROC            1.16.2      2020-03-19 [1] CRAN (R 3.6.0)
#>  processx        3.4.2       2020-02-09 [1] CRAN (R 3.6.2)
#>  prodlim         2019.11.13  2019-11-17 [1] CRAN (R 3.6.0)
#>  promises        1.1.0       2019-10-04 [1] CRAN (R 3.6.0)
#>  ps              1.3.3       2020-05-08 [1] CRAN (R 3.6.2)
#>  purrr         * 0.3.4       2020-04-17 [1] CRAN (R 3.6.2)
#>  R6              2.4.1       2019-11-12 [1] CRAN (R 3.6.0)
#>  Rcpp            1.0.4.6     2020-04-09 [1] CRAN (R 3.6.2)
#>  recipes       * 0.1.12      2020-05-01 [1] CRAN (R 3.6.2)
#>  remotes         2.1.1       2020-02-15 [1] CRAN (R 3.6.0)
#>  reshape2        1.4.4       2020-04-09 [1] CRAN (R 3.6.2)
#>  rlang           0.4.6       2020-05-02 [1] CRAN (R 3.6.2)
#>  rmarkdown       2.1         2020-01-20 [1] CRAN (R 3.6.0)
#>  rpart           4.1-15      2019-04-12 [1] CRAN (R 3.6.2)
#>  rprojroot       1.3-2       2018-01-03 [1] CRAN (R 3.6.0)
#>  rsample       * 0.0.6       2020-03-31 [1] CRAN (R 3.6.2)
#>  rsconnect       0.8.16      2019-12-13 [1] CRAN (R 3.6.2)
#>  rstan           2.19.3      2020-02-11 [1] CRAN (R 3.6.0)
#>  rstanarm        2.19.3      2020-02-11 [1] CRAN (R 3.6.2)
#>  rstantools      2.0.0       2019-09-15 [1] CRAN (R 3.6.0)
#>  rstudioapi      0.11        2020-02-07 [1] CRAN (R 3.6.0)
#>  scales        * 1.1.1       2020-05-11 [1] CRAN (R 3.6.2)
#>  sessioninfo     1.1.1       2018-11-05 [1] CRAN (R 3.6.0)
#>  shiny           1.4.0.2     2020-03-13 [1] CRAN (R 3.6.2)
#>  shinyjs         1.1         2020-01-13 [1] CRAN (R 3.6.0)
#>  shinystan       2.5.0       2018-05-01 [1] CRAN (R 3.6.0)
#>  shinythemes     1.1.2       2018-11-06 [1] CRAN (R 3.6.0)
#>  SnowballC       0.7.0       2020-04-01 [1] CRAN (R 3.6.2)
#>  StanHeaders     2.21.0-1    2020-01-19 [1] CRAN (R 3.6.0)
#>  statmod         1.4.34      2020-02-17 [1] CRAN (R 3.6.0)
#>  stringi         1.4.6       2020-02-17 [1] CRAN (R 3.6.0)
#>  stringr         1.4.0       2019-02-10 [1] CRAN (R 3.6.0)
#>  survival        3.1-12      2020-04-10 [1] CRAN (R 3.6.2)
#>  testthat        2.3.2       2020-03-02 [1] CRAN (R 3.6.2)
#>  threejs         0.3.3       2020-01-21 [1] CRAN (R 3.6.0)
#>  tibble        * 3.0.1       2020-04-20 [1] CRAN (R 3.6.2)
#>  tidymodels    * 0.1.0       2020-02-16 [1] CRAN (R 3.6.0)
#>  tidyposterior   0.0.2       2018-11-15 [1] CRAN (R 3.6.0)
#>  tidypredict     0.4.5       2020-02-10 [1] CRAN (R 3.6.0)
#>  tidyr           1.0.3       2020-05-07 [1] CRAN (R 3.6.2)
#>  tidyselect      1.1.0       2020-05-11 [1] CRAN (R 3.6.2)
#>  tidytext        0.2.4.900   2020-05-01 [1] local
#>  timeDate        3043.102    2018-02-21 [1] CRAN (R 3.6.0)
#>  tokenizers      0.2.1       2018-03-29 [1] CRAN (R 3.6.0)
#>  tune          * 0.1.0       2020-04-02 [1] CRAN (R 3.6.2)
#>  usethis         1.6.1       2020-04-29 [1] CRAN (R 3.6.2)
#>  utf8            1.1.4       2018-05-24 [1] CRAN (R 3.6.0)
#>  vctrs           0.3.0.9000  2020-05-20 [1] Github (r-lib/vctrs@a23fdcf)
#>  withr           2.2.0       2020-04-20 [1] CRAN (R 3.6.2)
#>  workflows     * 0.1.1       2020-03-17 [1] CRAN (R 3.6.0)
#>  xfun            0.13        2020-04-13 [1] CRAN (R 3.6.2)
#>  xtable          1.8-4       2019-04-21 [1] CRAN (R 3.6.0)
#>  xts             0.12-0      2020-01-19 [1] CRAN (R 3.6.0)
#>  yaml            2.2.1       2020-02-01 [1] CRAN (R 3.6.0)
#>  yardstick     * 0.0.6       2020-03-17 [1] CRAN (R 3.6.0)
#>  zoo             1.8-8       2020-05-02 [1] CRAN (R 3.6.2)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
```

New dplyr

library(tidymodels)
#> ── Attaching packages ──────────────────────────────────────────────────────────────────────────── tidymodels 0.1.0 ──
#> ✓ broom     0.5.6           ✓ recipes   0.1.12     
#> ✓ dials     0.0.6           ✓ rsample   0.0.6      
#> ✓ dplyr     0.8.99.9003     ✓ tibble    3.0.1      
#> ✓ ggplot2   3.3.0           ✓ tune      0.1.0      
#> ✓ infer     0.5.1           ✓ workflows 0.1.1      
#> ✓ parsnip   0.1.1.9000      ✓ yardstick 0.0.6      
#> ✓ purrr     0.3.4
#> ── Conflicts ─────────────────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x recipes::step()   masks stats::step()

trees_param <- parameters(trees(), min_n(), sample_size())

basic_af <- tibble(id = rep("trees", 3), hello = 1:3)

## look good
trees_param %>% mutate(hello = 1)
#> # A tibble: 3 x 7
#>   name        id          source component component_id object      hello
#>   <chr>       <chr>       <chr>  <chr>     <chr>        <list>      <dbl>
#> 1 trees       trees       list   unknown   unknown      <nparam[+]>     1
#> 2 min_n       min_n       list   unknown   unknown      <nparam[+]>     1
#> 3 sample_size sample_size list   unknown   unknown      <nparam[?]>     1
trees_param %>% arrange(id)
#> Collection of 3 parameters for tuning
#> 
#>           id parameter type object class
#>        min_n          min_n    nparam[+]
#>  sample_size    sample_size    nparam[?]
#>        trees          trees    nparam[+]
#> 
#> Parameters needing finalization:
#>    # Observations Sampled ('sample_size')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.
trees_param %>% select(id)
#> # A tibble: 3 x 1
#>   id         
#>   <chr>      
#> 1 trees      
#> 2 min_n      
#> 3 sample_size
anti_join(trees_param, basic_af, by = "id")
#> Collection of 2 parameters for tuning
#> 
#>           id parameter type object class
#>        min_n          min_n    nparam[+]
#>  sample_size    sample_size    nparam[?]
#> 
#> Parameters needing finalization:
#>    # Observations Sampled ('sample_size')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.
right_join(trees_param, basic_af, by = "id")
#> # A tibble: 3 x 7
#>   name  id    source component component_id object      hello
#>   <chr> <chr> <chr>  <chr>     <chr>        <list>      <int>
#> 1 trees trees list   unknown   unknown      <nparam[+]>     1
#> 2 trees trees list   unknown   unknown      <nparam[+]>     2
#> 3 trees trees list   unknown   unknown      <nparam[+]>     3
full_join(trees_param, basic_af, by = "id")
#> # A tibble: 5 x 7
#>   name        id          source component component_id object      hello
#>   <chr>       <chr>       <chr>  <chr>     <chr>        <list>      <int>
#> 1 trees       trees       list   unknown   unknown      <nparam[+]>     1
#> 2 trees       trees       list   unknown   unknown      <nparam[+]>     2
#> 3 trees       trees       list   unknown   unknown      <nparam[+]>     3
#> 4 min_n       min_n       list   unknown   unknown      <nparam[+]>    NA
#> 5 sample_size sample_size list   unknown   unknown      <nparam[?]>    NA
left_join(trees_param, basic_af, by = "id")
#> # A tibble: 5 x 7
#>   name        id          source component component_id object      hello
#>   <chr>       <chr>       <chr>  <chr>     <chr>        <list>      <int>
#> 1 trees       trees       list   unknown   unknown      <nparam[+]>     1
#> 2 trees       trees       list   unknown   unknown      <nparam[+]>     2
#> 3 trees       trees       list   unknown   unknown      <nparam[+]>     3
#> 4 min_n       min_n       list   unknown   unknown      <nparam[+]>    NA
#> 5 sample_size sample_size list   unknown   unknown      <nparam[?]>    NA

## this one also surprises and concerns me: STILL PARAMETERS
trees_param %>% mutate(id = paste0(id, "_better"),
                       name = paste0(name, "_best"))
#> Collection of 3 parameters for tuning
#> 
#>                  id   parameter type object class
#>        trees_better       trees_best    nparam[+]
#>        min_n_better       min_n_best    nparam[+]
#>  sample_size_better sample_size_best    nparam[?]
#> 
#> Parameters needing finalization:
#>    # Observations Sampled ('sample_size_better')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.

Created on 2020-05-20 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Mojave 10.14.6
#>  system   x86_64, darwin15.6.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Denver
#>  date     2020-05-20
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package       * version     date       lib source
#>  assertthat      0.2.1       2019-03-21 [1] CRAN (R 3.6.0)
#>  backports       1.1.7       2020-05-13 [1] CRAN (R 3.6.2)
#>  base64enc       0.1-3       2015-07-28 [1] CRAN (R 3.6.0)
#>  bayesplot       1.7.1       2019-12-01 [1] CRAN (R 3.6.0)
#>  boot            1.3-25      2020-04-26 [1] CRAN (R 3.6.2)
#>  broom         * 0.5.6       2020-04-20 [1] CRAN (R 3.6.2)
#>  callr           3.4.3       2020-03-28 [1] CRAN (R 3.6.2)
#>  class           7.3-17      2020-04-26 [1] CRAN (R 3.6.2)
#>  cli             2.0.2       2020-02-28 [1] CRAN (R 3.6.0)
#>  codetools       0.2-16      2018-12-24 [1] CRAN (R 3.6.2)
#>  colorspace      1.4-1       2019-03-18 [1] CRAN (R 3.6.0)
#>  colourpicker    1.0         2017-09-27 [1] CRAN (R 3.6.0)
#>  crayon          1.3.4       2017-09-16 [1] CRAN (R 3.6.0)
#>  crosstalk       1.1.0.1     2020-03-13 [1] CRAN (R 3.6.2)
#>  desc            1.2.0       2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools        2.3.0       2020-04-10 [1] CRAN (R 3.6.2)
#>  dials         * 0.0.6       2020-05-20 [1] Github (tidymodels/dials@b876ade)
#>  DiceDesign      1.8-1       2019-07-31 [1] CRAN (R 3.6.0)
#>  digest          0.6.25      2020-02-23 [1] CRAN (R 3.6.0)
#>  dplyr         * 0.8.99.9003 2020-05-20 [1] Github (tidyverse/dplyr@688c534)
#>  DT              0.13        2020-03-23 [1] CRAN (R 3.6.2)
#>  dygraphs        1.1.1.6     2018-07-11 [1] CRAN (R 3.6.0)
#>  ellipsis        0.3.1       2020-05-15 [1] CRAN (R 3.6.2)
#>  evaluate        0.14        2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi           0.4.1       2020-01-08 [1] CRAN (R 3.6.0)
#>  fastmap         1.0.1       2019-10-08 [1] CRAN (R 3.6.0)
#>  foreach         1.5.0       2020-03-30 [1] CRAN (R 3.6.2)
#>  fs              1.4.1       2020-04-04 [1] CRAN (R 3.6.2)
#>  furrr           0.1.0       2018-05-16 [1] CRAN (R 3.6.0)
#>  future          1.17.0      2020-04-18 [1] CRAN (R 3.6.2)
#>  generics        0.0.2       2018-11-29 [1] CRAN (R 3.6.0)
#>  ggplot2       * 3.3.0       2020-03-05 [1] CRAN (R 3.6.0)
#>  ggridges        0.5.2       2020-01-12 [1] CRAN (R 3.6.0)
#>  globals         0.12.5      2019-12-07 [1] CRAN (R 3.6.0)
#>  glue            1.4.1       2020-05-13 [1] CRAN (R 3.6.2)
#>  gower           0.2.1       2019-05-14 [1] CRAN (R 3.6.0)
#>  GPfit           1.0-8       2019-02-08 [1] CRAN (R 3.6.0)
#>  gridExtra       2.3         2017-09-09 [1] CRAN (R 3.6.0)
#>  gtable          0.3.0       2019-03-25 [1] CRAN (R 3.6.0)
#>  gtools          3.8.2       2020-03-31 [1] CRAN (R 3.6.2)
#>  highr           0.8         2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools       0.4.0       2019-10-04 [1] CRAN (R 3.6.0)
#>  htmlwidgets     1.5.1       2019-10-08 [1] CRAN (R 3.6.0)
#>  httpuv          1.5.2       2019-09-11 [1] CRAN (R 3.6.0)
#>  igraph          1.2.5       2020-03-19 [1] CRAN (R 3.6.0)
#>  infer         * 0.5.1       2019-11-19 [1] CRAN (R 3.6.0)
#>  inline          0.3.15      2018-05-18 [1] CRAN (R 3.6.0)
#>  ipred           0.9-9       2019-04-28 [1] CRAN (R 3.6.0)
#>  iterators       1.0.12      2019-07-26 [1] CRAN (R 3.6.0)
#>  janeaustenr     0.1.5       2017-06-10 [1] CRAN (R 3.6.0)
#>  knitr           1.28        2020-02-06 [1] CRAN (R 3.6.0)
#>  later           1.0.0       2019-10-04 [1] CRAN (R 3.6.0)
#>  lattice         0.20-41     2020-04-02 [1] CRAN (R 3.6.2)
#>  lava            1.6.7       2020-03-05 [1] CRAN (R 3.6.2)
#>  lhs             1.0.2       2020-04-13 [1] CRAN (R 3.6.2)
#>  lifecycle       0.2.0       2020-03-06 [1] CRAN (R 3.6.0)
#>  listenv         0.8.0       2019-12-05 [1] CRAN (R 3.6.0)
#>  lme4            1.1-23      2020-04-07 [1] CRAN (R 3.6.2)
#>  loo             2.2.0       2019-12-19 [1] CRAN (R 3.6.0)
#>  lubridate       1.7.8       2020-04-06 [1] CRAN (R 3.6.2)
#>  magrittr        1.5         2014-11-22 [1] CRAN (R 3.6.0)
#>  markdown        1.1         2019-08-07 [1] CRAN (R 3.6.0)
#>  MASS            7.3-51.6    2020-04-26 [1] CRAN (R 3.6.2)
#>  Matrix          1.2-18      2019-11-27 [1] CRAN (R 3.6.0)
#>  matrixStats     0.56.0      2020-03-13 [1] CRAN (R 3.6.0)
#>  memoise         1.1.0       2017-04-21 [1] CRAN (R 3.6.0)
#>  mime            0.9         2020-02-04 [1] CRAN (R 3.6.2)
#>  miniUI          0.1.1.1     2018-05-18 [1] CRAN (R 3.6.0)
#>  minqa           1.2.4       2014-10-09 [1] CRAN (R 3.6.0)
#>  munsell         0.5.0       2018-06-12 [1] CRAN (R 3.6.0)
#>  nlme            3.1-147     2020-04-13 [1] CRAN (R 3.6.2)
#>  nloptr          1.2.2.1     2020-03-11 [1] CRAN (R 3.6.0)
#>  nnet            7.3-14      2020-04-26 [1] CRAN (R 3.6.2)
#>  parsnip       * 0.1.1.9000  2020-05-19 [1] local
#>  pillar          1.4.4       2020-05-05 [1] CRAN (R 3.6.2)
#>  pkgbuild        1.0.8       2020-05-07 [1] CRAN (R 3.6.2)
#>  pkgconfig       2.0.3       2019-09-22 [1] CRAN (R 3.6.0)
#>  pkgload         1.0.2       2018-10-29 [1] CRAN (R 3.6.0)
#>  plyr            1.8.6       2020-03-03 [1] CRAN (R 3.6.2)
#>  prettyunits     1.1.1       2020-01-24 [1] CRAN (R 3.6.0)
#>  pROC            1.16.2      2020-03-19 [1] CRAN (R 3.6.0)
#>  processx        3.4.2       2020-02-09 [1] CRAN (R 3.6.2)
#>  prodlim         2019.11.13  2019-11-17 [1] CRAN (R 3.6.0)
#>  promises        1.1.0       2019-10-04 [1] CRAN (R 3.6.0)
#>  ps              1.3.3       2020-05-08 [1] CRAN (R 3.6.2)
#>  purrr         * 0.3.4       2020-04-17 [1] CRAN (R 3.6.2)
#>  R6              2.4.1       2019-11-12 [1] CRAN (R 3.6.0)
#>  Rcpp            1.0.4.6     2020-04-09 [1] CRAN (R 3.6.2)
#>  recipes       * 0.1.12      2020-05-01 [1] CRAN (R 3.6.2)
#>  remotes         2.1.1       2020-02-15 [1] CRAN (R 3.6.0)
#>  reshape2        1.4.4       2020-04-09 [1] CRAN (R 3.6.2)
#>  rlang           0.4.6       2020-05-02 [1] CRAN (R 3.6.2)
#>  rmarkdown       2.1         2020-01-20 [1] CRAN (R 3.6.0)
#>  rpart           4.1-15      2019-04-12 [1] CRAN (R 3.6.2)
#>  rprojroot       1.3-2       2018-01-03 [1] CRAN (R 3.6.0)
#>  rsample       * 0.0.6       2020-03-31 [1] CRAN (R 3.6.2)
#>  rsconnect       0.8.16      2019-12-13 [1] CRAN (R 3.6.2)
#>  rstan           2.19.3      2020-02-11 [1] CRAN (R 3.6.0)
#>  rstanarm        2.19.3      2020-02-11 [1] CRAN (R 3.6.2)
#>  rstantools      2.0.0       2019-09-15 [1] CRAN (R 3.6.0)
#>  rstudioapi      0.11        2020-02-07 [1] CRAN (R 3.6.0)
#>  scales        * 1.1.1       2020-05-11 [1] CRAN (R 3.6.2)
#>  sessioninfo     1.1.1       2018-11-05 [1] CRAN (R 3.6.0)
#>  shiny           1.4.0.2     2020-03-13 [1] CRAN (R 3.6.2)
#>  shinyjs         1.1         2020-01-13 [1] CRAN (R 3.6.0)
#>  shinystan       2.5.0       2018-05-01 [1] CRAN (R 3.6.0)
#>  shinythemes     1.1.2       2018-11-06 [1] CRAN (R 3.6.0)
#>  SnowballC       0.7.0       2020-04-01 [1] CRAN (R 3.6.2)
#>  StanHeaders     2.21.0-1    2020-01-19 [1] CRAN (R 3.6.0)
#>  statmod         1.4.34      2020-02-17 [1] CRAN (R 3.6.0)
#>  stringi         1.4.6       2020-02-17 [1] CRAN (R 3.6.0)
#>  stringr         1.4.0       2019-02-10 [1] CRAN (R 3.6.0)
#>  survival        3.1-12      2020-04-10 [1] CRAN (R 3.6.2)
#>  testthat        2.3.2       2020-03-02 [1] CRAN (R 3.6.2)
#>  threejs         0.3.3       2020-01-21 [1] CRAN (R 3.6.0)
#>  tibble        * 3.0.1       2020-04-20 [1] CRAN (R 3.6.2)
#>  tidymodels    * 0.1.0       2020-02-16 [1] CRAN (R 3.6.0)
#>  tidyposterior   0.0.2       2018-11-15 [1] CRAN (R 3.6.0)
#>  tidypredict     0.4.5       2020-02-10 [1] CRAN (R 3.6.0)
#>  tidyr           1.0.3       2020-05-07 [1] CRAN (R 3.6.2)
#>  tidyselect      1.1.0       2020-05-11 [1] CRAN (R 3.6.2)
#>  tidytext        0.2.4.900   2020-05-01 [1] local
#>  timeDate        3043.102    2018-02-21 [1] CRAN (R 3.6.0)
#>  tokenizers      0.2.1       2018-03-29 [1] CRAN (R 3.6.0)
#>  tune          * 0.1.0       2020-04-02 [1] CRAN (R 3.6.2)
#>  usethis         1.6.1       2020-04-29 [1] CRAN (R 3.6.2)
#>  utf8            1.1.4       2018-05-24 [1] CRAN (R 3.6.0)
#>  vctrs           0.3.0.9000  2020-05-20 [1] Github (r-lib/vctrs@a23fdcf)
#>  withr           2.2.0       2020-04-20 [1] CRAN (R 3.6.2)
#>  workflows     * 0.1.1       2020-03-17 [1] CRAN (R 3.6.0)
#>  xfun            0.13        2020-04-13 [1] CRAN (R 3.6.2)
#>  xtable          1.8-4       2019-04-21 [1] CRAN (R 3.6.0)
#>  xts             0.12-0      2020-01-19 [1] CRAN (R 3.6.0)
#>  yaml            2.2.1       2020-02-01 [1] CRAN (R 3.6.0)
#>  yardstick     * 0.0.6       2020-03-17 [1] CRAN (R 3.6.0)
#>  zoo             1.8-8       2020-05-02 [1] CRAN (R 3.6.2)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
```

DavisVaughan commented 4 years ago

Unfortunately I think that one is going to have to be allowed. The internal check makes sure that the types of the columns are still as expected (i.e. character), and ensures that id isn't duplicated, but it doesn't check anything else about the data in the columns.

We can't check that the columns are exactly the same before and after a transformation, because we want things like df[1:2,] to work and still return a parameters df.

There is currently no way to say: "The data in a column was X before the transform, now after the transform it is Y. Now check that they are still compatible with each other".

The no-duplicates check on id works because it doesn't rely on the data before the transformation in any way.
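To make the distinction concrete, the sketch below contrasts a row subset with a subset that duplicates an id. It assumes a dials version that includes this PR and that `[` on a parameters object goes through the reconstruction check described in this thread. No output is shown because it may differ between versions.

```r
library(dials)

trees_param <- parameters(trees(), min_n(), sample_size())

# Dropping a row only changes row presence, so the class is expected to survive
kept <- trees_param[1:2, ]
inherits(kept, "parameters")

# Repeating a row duplicates `id`, which the check can detect from the result
# alone, so this is expected to fall back to a bare tibble
dups <- trees_param[c(1, 1), ]
inherits(dups, "parameters")
```

Neither case needs the pre-transformation values. A renamed id, by contrast, could only be flagged by comparing against the original data.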

juliasilge commented 4 years ago

OK, makes sense, I am clearer now on what kind of changes in columns are being checked. ✅

github-actions[bot] commented 3 years ago

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.