tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/
Other
4.78k stars 2.12k forks source link

`mutate()` > `dplyr_col_select()` #6081

Closed romainfrancois closed 2 years ago

romainfrancois commented 3 years ago

With the change in https://github.com/tidyverse/dplyr/pull/6035, mutate.data.frame() always end up calling dplyr_col_select(out, cols_retain) when before it did not in the keep == "all" case.

I think it is better, however I believe this is what is breaking fable tools and then fable because:

fabletools:::`[.mdl_df`
#> function (x, i, j, drop = FALSE) 
#> {
#>     out <- as_tibble(NextMethod())
#>     cn <- colnames(out)
#>     kv <- intersect(key_vars(x), cn)
#>     mv <- intersect(mable_vars(x), cn)
#>     if (n_keys(x) != length(kv)) {
#>         return(out)
#>     }
#>     build_mable(out, key = !!kv, model = !!mv)
#> }
#> <bytecode: 0x7fe1da9332a8>
#> <environment: namespace:fabletools>

Created on 2021-11-15 by the reprex package (v2.0.1.9000)

Because of this, mutate.data.frame() might return a tibble now, but used to return a mdl_df before. A workaround might be to do: dplyr_reconstruct(dplyr_col_select(out, cols_retain), .data) but it otherwise feel like this should be fixed in fabletools:::[.mdl_df.

cc @DavisVaughan

romainfrancois commented 3 years ago

rev dev results for fable :

── After ─────────────────────────────────────────────────────────────────────────────────────────────────────
> checking examples ... ERROR
  Running examples in ‘fable-Ex.R’ failed
  The error most likely occurred in:

  > ### Name: refit.AR
  > ### Title: Refit an AR model
  > ### Aliases: refit.AR
  > 
  > ### ** Examples
  > 
  > lung_deaths_male <- as_tsibble(mdeaths)
  > lung_deaths_female <- as_tsibble(fdeaths)
  > 
  > fit <- lung_deaths_male %>%
  +   model(AR(value ~ 1 + order(10)))
  > 
  > report(fit)
  Series: value 
  Model: AR(10) w/ mean 

  Coefficients:
    constant     ar1      ar2     ar3      ar4      ar5    ar6      ar7      ar8
    572.0428  0.6739  -0.1584  0.1291  -0.1581  -0.0342  0.012  -0.1172  -0.0832
        ar9    ar10
    -0.0712  0.4145

  sigma^2 estimated as 37672
  AIC = -93.6   AICc = -89.2    BIC = -68.55> 
  > fit %>%
  +   refit(lung_deaths_female) %>%
  +   report()
  Error in UseMethod("report") : 
    no applicable method for 'report' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
  Calls: %>% -> report
  Execution halted

> checking tests ... ERROR
  See below...

── Test failures ─────────────────────────────────────────────────────────────────────────────── testthat ────

> library(testthat)
> library(fable)
Loading required package: fabletools
> 
> test_check("fable")
══ Skipped tests ═══════════════════════════════════════════════════════════════
• On CRAN (1)

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (test-arima.R:66:3): Manual ARIMA selection ─────────────────────────
fitted(fable_fit) not equivalent to fitted(refit(fable_fit, USAccDeaths_tbl)).
current is not list-like
── Error (test-ets.R:51:3): Manual ETS selection ───────────────────────────────
Error in `UseMethod("tidy")`: no applicable method for 'tidy' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
Backtrace:
    ▆
 1. ├─testthat::expect_identical(...) at test-ets.R:51:2
 2. │ └─testthat::quasi_label(enquo(object), label, arg = "object")
 3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
 4. └─generics::tidy(refit(fable_fit, USAccDeaths_tbl))
── Error (test-lm.R:75:3): LM ──────────────────────────────────────────────────
Error in `UseMethod("tidy")`: no applicable method for 'tidy' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
Backtrace:
    ▆
 1. ├─testthat::expect_identical(...) at test-lm.R:75:2
 2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg = "expected")
 3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
 4. └─generics::tidy(refit(fable_fit, USAccDeaths_tbl))

[ FAIL 3 | WARN 2 | SKIP 1 | PASS 90 ]
Error: Test failures
Execution halted

2 errors x | 0 warnings ✓ | 0 notes ✓
romainfrancois commented 3 years ago

rev dev results for fabletools:

── After ─────────────────────────────────────────────────────────────────────────────────────────────────────
> checking examples ... ERROR
  Running examples in ‘fabletools-Ex.R’ failed
  The error most likely occurred in:

  > ### Name: reconcile
  > ### Title: Forecast reconciliation
  > ### Aliases: reconcile reconcile.mdl_df
  > 
  > ### ** Examples
  > 
  > if (requireNamespace("fable", quietly = TRUE)) {
  + library(fable)
  + lung_deaths_agg <- as_tsibble(cbind(mdeaths, fdeaths)) %>%
  +   aggregate_key(key, value = sum(value))
  + 
  + lung_deaths_agg %>%
  +   model(lm = TSLM(value ~ trend() + season())) %>%
  +   reconcile(lm = min_trace(lm)) %>% 
  +   forecast()
  + }
  Error in UseMethod("forecast") : 
    no applicable method for 'forecast' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
  Calls: %>% -> forecast
  Execution halted

> checking tests ... ERROR
  See below...

── Test failures ─────────────────────────────────────────────────────────────────────────────── testthat ────

> library(testthat)
> library(dplyr)

Attaching package: 'dplyr'

The following object is masked from 'package:testthat':

    matches

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

> 
> test_check("fabletools")
Loading required package: fabletools
══ Skipped tests ═══════════════════════════════════════════════════════════════
• On CRAN (1)

══ Failed tests ════════════════════════════════════════════════════════════════
── Error (test-combination.R:9:3): Combination modelling ───────────────────────
Error in `UseMethod("augment")`: no applicable method for 'augment' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
Backtrace:
    ▆
 1. ├─testthat::expect_equal(...) at test-combination.R:9:2
 2. │ └─testthat::quasi_label(enquo(object), label, arg = "object")
 3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
 4. ├─dplyr::select(augment(mbl_cmbn), -.model, -.innov)
 5. └─generics::augment(mbl_cmbn)
── Error (test-reconciliation.R:26:3): reconciliation ──────────────────────────
Error in `UseMethod("forecast")`: no applicable method for 'forecast' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
Backtrace:
    ▆
 1. ├─fit_agg %>% reconcile(snaive = min_trace(snaive)) %>% ... at test-reconciliation.R:26:2
 2. └─fabletools::forecast(.)

[ FAIL 2 | WARN 0 | SKIP 1 | PASS 280 ]
Error: Test failures
Execution halted

2 errors x | 0 warnings ✓ | 0 notes ✓
DavisVaughan commented 3 years ago

@mitchelloharawild does this look like a bug in [.mbl_df to you? Here is a minimal reprex that is completely separate from dplyr:

library(fabletools)
library(fable)

lung_deaths_agg <- as_tsibble(cbind(mdeaths, fdeaths)) %>%
  aggregate_key(key, value = sum(value))

df_mable <- lung_deaths_agg %>%
  model(lm = TSLM(value ~ trend() + season()))

# A mable
df_mable
#> # A mable: 3 x 2
#> # Key:     key [3]
#>   key               lm
#>   <chr*>       <model>
#> 1 fdeaths       <TSLM>
#> 2 mdeaths       <TSLM>
#> 3 <aggregated>  <TSLM>

# Not a mable?
df_mable[c("key", "lm")]
#> # A tibble: 3 × 2
#>   key               lm
#>   <chr*>       <model>
#> 1 fdeaths       <TSLM>
#> 2 mdeaths       <TSLM>
#> 3 <aggregated>  <TSLM>

I think the if (n_keys(x) != length(kv)) { condition of [.mbl_df isn't quite right?

mitchelloharawild commented 2 years ago

Thanks @DavisVaughan & @romainfrancois - yes this is a bug on my end.

Assuming this is found from revdep checks in preparation of a dplyr release, when should I push an update of fabletools to CRAN?

romainfrancois commented 2 years ago

We plan to release in a few weeks. The sooner, the better I'd say.