tidymodels / parsnip

A tidy unified interface to models
https://parsnip.tidymodels.org

Potential error on S3 dispatch #1090

Closed pgg1309 closed 6 months ago

pgg1309 commented 6 months ago

The problem

I'm having trouble with building a working parsnip model. The model fit works fine but for some reason S3 is not able to properly find the predict function.

See below a reproducible example. Any help is really appreciated.

Reproducible example

``` r
library(tidymodels)

# this package uses `hardhat` to build a model
# and then registers the model for use with parsnip
devtools::install_github("pgg1309/fused.ridge")
#> Using GitHub PAT from the git credential store.
#> Skipping install of 'fused.ridge' from a github remote, the SHA1 (efb3dd72) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(fused.ridge)
#> 
#> Attaching package: 'fused.ridge'
#> The following object is masked from 'package:stats':
#> 
#>     predict

# Load data
db <- mtcars

# fitting the model works as expected
my_fit <- fused_model(penalty = 2) |> 
  set_engine("fused_ridge") |>
  fit(mpg ~ ., data = db)
class(my_fit)
#> [1] "_fused_ridge" "model_fit"
class(my_fit$fit)
#> [1] "fused_ridge"    "hardhat_model"  "hardhat_scalar"

# however, there is an error when trying to predict
predict(my_fit, new_data = tail(db))
#> Error: `predict()` is not defined for a '_fused_ridge'.

# it works when using predict for 'model_fit'
predict.model_fit(my_fit, new_data = tail(db))
#> # A tibble: 6 × 1
#>   .pred
#>   <dbl>
#> 1  21.8
#> 2  25.6
#> 3  20.2
#> 4  19.9
#> 5  19.3
#> 6  27.1

# and it also works when using the predict method for the model
predict(my_fit$fit, new_data = tail(db))
#> # A tibble: 6 × 1
#>   .pred
#>   <dbl>
#> 1  21.8
#> 2  25.6
#> 3  20.2
#> 4  19.9
#> 5  19.3
#> 6  27.1

# when checking the S3 methods for the class of the model
# we see that there is a predict method for the class
sloop::s3_dispatch(predict(my_fit, new_data = tail(db)))
#>    predict._fused_ridge
#> => predict.model_fit
#>    predict.default
```

Created on 2024-03-25 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29 ucrt)
#>  os       Windows 11 x64 (build 22631)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.utf8
#>  ctype    English_United States.utf8
#>  tz       America/Sao_Paulo
#>  date     2024-03-25
#>  pandoc   3.1.1 @ C:/Users/pgrahl/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.3.1)
#>  bit            4.0.5      2022-11-15 [1] CRAN (R 4.3.3)
#>  bit64          4.0.5      2020-08-30 [1] CRAN (R 4.3.3)
#>  broom        * 1.0.5      2023-06-09 [1] CRAN (R 4.3.3)
#>  cachem         1.0.8      2023-05-01 [1] CRAN (R 4.3.3)
#>  class          7.3-22     2023-05-03 [1] CRAN (R 4.3.3)
#>  cli            3.6.2      2023-12-11 [1] CRAN (R 4.3.3)
#>  codetools      0.2-19     2023-02-01 [1] CRAN (R 4.3.3)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.3.3)
#>  crayon         1.5.2      2022-09-29 [1] CRAN (R 4.3.3)
#>  curl           5.2.1      2024-03-01 [1] CRAN (R 4.3.3)
#>  CVXR           1.0-12     2024-02-02 [1] CRAN (R 4.3.3)
#>  data.table     1.15.2     2024-02-29 [1] CRAN (R 4.3.3)
#>  devtools       2.4.5      2022-10-11 [1] CRAN (R 4.3.3)
#>  dials        * 1.2.1      2024-02-22 [1] CRAN (R 4.3.3)
#>  DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.3.3)
#>  digest         0.6.35     2024-03-11 [1] CRAN (R 4.3.3)
#>  dplyr        * 1.1.4      2023-11-17 [1] CRAN (R 4.3.3)
#>  ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.3.3)
#>  evaluate       0.23       2023-11-01 [1] CRAN (R 4.3.3)
#>  fansi          1.0.6      2023-12-08 [1] CRAN (R 4.3.3)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.3.3)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.3.3)
#>  fs             1.6.3      2023-07-20 [1] CRAN (R 4.3.3)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.3.3)
#>  fused.ridge  * 0.0.0.9000 2024-03-25 [1] Github (pgg1309/fused.ridge@efb3dd7)
#>  future         1.33.1     2023-12-22 [1] CRAN (R 4.3.3)
#>  future.apply   1.11.1     2023-12-21 [1] CRAN (R 4.3.3)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.3)
#>  ggplot2      * 3.5.0      2024-02-23 [1] CRAN (R 4.3.3)
#>  globals        0.16.3     2024-03-08 [1] CRAN (R 4.3.3)
#>  glue           1.7.0      2024-01-09 [1] CRAN (R 4.3.3)
#>  gmp            0.7-4      2024-01-15 [1] CRAN (R 4.3.3)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.3.1)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.3.3)
#>  gtable         0.3.4      2023-08-21 [1] CRAN (R 4.3.3)
#>  hardhat        1.3.1      2024-02-02 [1] CRAN (R 4.3.3)
#>  htmltools      0.5.7      2023-11-03 [1] CRAN (R 4.3.3)
#>  htmlwidgets    1.6.4      2023-12-06 [1] CRAN (R 4.3.3)
#>  httpuv         1.6.14     2024-01-26 [1] CRAN (R 4.3.3)
#>  infer        * 1.0.6      2024-01-31 [1] CRAN (R 4.3.3)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.3.3)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.3.3)
#>  knitr          1.45       2023-10-30 [1] CRAN (R 4.3.3)
#>  later          1.3.2      2023-12-06 [1] CRAN (R 4.3.3)
#>  lattice        0.22-5     2023-10-24 [1] CRAN (R 4.3.3)
#>  lava           1.8.0      2024-03-05 [1] CRAN (R 4.3.3)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.3.3)
#>  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.3.3)
#>  listenv        0.9.1      2024-01-29 [1] CRAN (R 4.3.3)
#>  lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.3.3)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.3)
#>  MASS           7.3-60.0.1 2024-01-13 [1] CRAN (R 4.3.3)
#>  Matrix         1.6-5      2024-01-11 [1] CRAN (R 4.3.3)
#>  memoise        2.0.1      2021-11-26 [1] CRAN (R 4.3.3)
#>  mime           0.12       2021-09-28 [1] CRAN (R 4.3.1)
#>  miniUI         0.1.1.1    2018-05-18 [1] CRAN (R 4.3.3)
#>  modeldata    * 1.3.0      2024-01-21 [1] CRAN (R 4.3.3)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.3.3)
#>  nnet           7.3-19     2023-05-03 [1] CRAN (R 4.3.3)
#>  osqp           0.6.3.2    2023-10-20 [1] CRAN (R 4.3.3)
#>  parallelly     1.37.1     2024-02-29 [1] CRAN (R 4.3.3)
#>  parsnip      * 1.2.1      2024-03-22 [1] CRAN (R 4.3.3)
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.3.3)
#>  pkgbuild       1.4.4      2024-03-17 [1] CRAN (R 4.3.3)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.3)
#>  pkgload        1.3.4      2024-01-16 [1] CRAN (R 4.3.3)
#>  prodlim        2023.08.28 2023-08-28 [1] CRAN (R 4.3.3)
#>  profvis        0.3.8      2023-05-02 [1] CRAN (R 4.3.3)
#>  promises       1.2.1      2023-08-10 [1] CRAN (R 4.3.3)
#>  purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.3.3)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.3.3)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo           1.26.0     2024-01-24 [1] CRAN (R 4.3.2)
#>  R.utils        2.12.3     2023-11-18 [1] CRAN (R 4.3.3)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.3)
#>  Rcpp           1.0.12     2024-01-09 [1] CRAN (R 4.3.3)
#>  recipes      * 1.0.10     2024-02-18 [1] CRAN (R 4.3.3)
#>  remotes        2.5.0      2024-03-17 [1] CRAN (R 4.3.3)
#>  reprex         2.1.0      2024-01-11 [1] CRAN (R 4.3.3)
#>  rlang          1.1.3      2024-01-10 [1] CRAN (R 4.3.3)
#>  rmarkdown      2.26       2024-03-05 [1] CRAN (R 4.3.3)
#>  Rmpfr          0.9-5      2024-01-21 [1] CRAN (R 4.3.3)
#>  rpart          4.1.23     2023-12-05 [1] CRAN (R 4.3.3)
#>  rsample      * 1.2.1      2024-03-25 [1] CRAN (R 4.3.3)
#>  rstudioapi     0.16.0     2024-03-24 [1] CRAN (R 4.3.3)
#>  scales       * 1.3.0      2023-11-28 [1] CRAN (R 4.3.3)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.3)
#>  shiny          1.8.0      2023-11-17 [1] CRAN (R 4.3.3)
#>  sloop          1.0.1      2019-02-17 [1] CRAN (R 4.3.3)
#>  stringi        1.8.3      2023-12-11 [1] CRAN (R 4.3.2)
#>  stringr        1.5.1      2023-11-14 [1] CRAN (R 4.3.3)
#>  styler         1.10.2     2023-08-29 [1] CRAN (R 4.3.3)
#>  survival       3.5-8      2024-02-14 [1] CRAN (R 4.3.3)
#>  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.3.3)
#>  tidymodels   * 1.1.1      2023-08-24 [1] CRAN (R 4.3.3)
#>  tidyr        * 1.3.1      2024-01-24 [1] CRAN (R 4.3.3)
#>  tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.3.3)
#>  timechange     0.3.0      2024-01-18 [1] CRAN (R 4.3.3)
#>  timeDate       4032.109   2023-12-14 [1] CRAN (R 4.3.2)
#>  tune         * 1.2.0      2024-03-20 [1] CRAN (R 4.3.3)
#>  urlchecker     1.0.1      2021-11-30 [1] CRAN (R 4.3.3)
#>  usethis        2.2.3      2024-02-19 [1] CRAN (R 4.3.3)
#>  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.3.3)
#>  vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.3.3)
#>  withr          3.0.0      2024-01-16 [1] CRAN (R 4.3.3)
#>  workflows    * 1.1.4      2024-02-19 [1] CRAN (R 4.3.3)
#>  workflowsets * 1.1.0      2024-03-21 [1] CRAN (R 4.3.3)
#>  xfun           0.43       2024-03-25 [1] CRAN (R 4.3.3)
#>  xtable         1.8-4      2019-04-21 [1] CRAN (R 4.3.3)
#>  yaml           2.3.8      2023-12-11 [1] CRAN (R 4.3.2)
#>  yardstick    * 1.3.1      2024-03-21 [1] CRAN (R 4.3.3)
#>
#>  [1] C:/Users/pgrahl/AppData/Local/Programs/R/R-4.3.3/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

hfrick commented 6 months ago

Continuing the conversation from https://forum.posit.co/t/hardhat-vs-parsnip-to-create-new-models/183824

I had a look at your GitHub repo and I would suggest using the predict generic from stats, like parsnip does. I think that defining your own generic here interferes with the S3 dispatch. I would export stats::predict and generally move parsnip in the DESCRIPTION from Imports to Depends so that functions like set_engine() etc. are immediately available to users without you reexporting them.
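To illustrate the kind of interference meant here, below is a minimal base-R sketch. It does not use fused.ridge or parsnip. As an assumption of the sketch, `loess` stands in for any model whose predict method is registered for stats::predict but not exported, and the `predict.default` fallback is a hypothetical one modelled on the error message in the reprex. The sketch shows that a second generic named `predict` does not see methods registered for the stats generic.

``` r
# `loess` stands in for a model whose predict method is registered for
# stats::predict but not exported (an assumption of this sketch).
fit <- loess(dist ~ speed, data = cars)
new_speed <- data.frame(speed = 10)

# A package-style generic with the same name: it masks stats::predict and
# brings its own (hypothetical) fallback method.
predict <- function(object, ...) UseMethod("predict")
predict.default <- function(object, ...) {
  stop("`predict()` is not defined for a '", class(object)[1], "'.", call. = FALSE)
}

identical(predict, stats::predict)
#> [1] FALSE

# Dispatch through the masking generic does not find predict.loess,
# so it falls through to the fallback and errors.
masked <- try(predict(fit, new_speed), silent = TRUE)
inherits(masked, "try-error")
#> [1] TRUE

# Re-using the stats generic restores dispatch to the registered method.
predict <- stats::predict
restored <- predict(fit, new_speed)
is.numeric(restored)
#> [1] TRUE
```

Whether this is exactly what happens inside fused.ridge depends on how its methods are registered, but it matches the symptom in the reprex: the call ends up in a fallback method even though a suitable method exists for the stats generic.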

pgg1309 commented 6 months ago

Thanks @hfrick. I did start without exporting predict but I was getting an error, so I moved to exporting the generic. I will try exporting stats::predict and moving parsnip as you suggested. I will also take a look at the structure of brulee as @topepo suggested.

I'm new to package building, trying to learn by doing...but I would appreciate it if you could suggest some potential 'must-reads' to help me work with parsnip and hardhat in the tidymodels environment.

hfrick commented 6 months ago

brulee is a great example for making a modelling package with hardhat. Keep in mind though that the brulee package does not contain the parsnip model or engine; you will find those in parsnip itself. If you are looking for more examples of extending parsnip, you could take a look at the repos for bonsai, censored, multilevelmod, plsmod, poissonreg, and rules, which are all parsnip extension packages. They contain various engines for models in parsnip, i.e., the model definition itself is not in the extension package as it is for fused.ridge. But otherwise they should be helpful examples for this endeavour.

Since you already found Max's presentation on hardhat and the article on building a parsnip model, I think it's likely that you also already found the R packages book but I'll mention it for completeness. An additional resource specifically for extending parsnip are the checklists in this draft PR. Feedback on those is very welcome so if you run into things that are unclear, please ask!

pgg1309 commented 6 months ago

Thanks. My main goal is to be able to use the tidymodels workflow with my own modeling function. The ecosystem (recipes, tune, workflowsets) is very useful. After digging a bit, it seems that for the model I have in mind now (a glmnet with a modified constraint), perhaps building just an engine would do the trick, with no need to use all the hardhat infrastructure, so I will take a closer look at the repos for the packages you've mentioned. I appreciate the help. Thanks.

By the way, I added

#' @export

#' @importFrom stats predict

#' @rdname predict

stats::predict

to my package as you suggested and it worked !!! Thanks
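
For reference, a sketch of the NAMESPACE directives that roxygen2 would typically generate from those tags after running `devtools::document()`. The rest of the file depends on the package, so treat this as an illustration and not as the actual contents of fused.ridge's NAMESPACE:

``` r
# NAMESPACE (generated by roxygen2; do not edit by hand)
export(predict)
importFrom(stats,predict)
```

With these two directives the package re-exports the stats generic instead of defining a second `predict`, so users who attach the package get the same generic that parsnip's methods are written for.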

github-actions[bot] commented 5 months ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.