tidymodels / parsnip

A tidy unified interface to models
https://parsnip.tidymodels.org

one-class SVM with `kernlab` fails to produce predictions #974

Open aminadibi opened 1 year ago

aminadibi commented 1 year ago

The problem

I'm having trouble using parsnip to fit one-class SVMs with the kernlab engine and the type="one-svc" option. First, I cannot get the fitted model to produce any predictions (see the reprex below). I would appreciate any help with that.

Second, unlike with kernlab itself, the only way to fit the model with parsnip seems to be to create a fake response column to act as the y in the formula, even though one-class novelty detection does not require a response variable. Is there any other way?
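For comparison, this is roughly what a direct kernlab fit without a response column looks like. It is a minimal sketch, assuming kernlab is installed and that `ksvm()` is called through its matrix interface with `y` left out; the names `train_x`, `test_x`, and `ocsvm_fit` are illustrative.

```r
library(kernlab)

set.seed(200)
# Same simulated predictors as in the reprex, held as plain matrices.
train_x <- cbind(x1 = rnorm(200), x2 = rnorm(200) + 2)
test_x <- train_x + 1

# No response is supplied: with type = "one-svc", ksvm() fits a
# one-class (novelty detection) model on the predictors alone.
ocsvm_fit <- ksvm(train_x, type = "one-svc", kernel = "rbfdot")

# Predictions are logical; TRUE is read here as "looks like the training data".
pred <- as.vector(predict(ocsvm_fit, test_x))
table(pred)
```

This bypasses parsnip entirely, so it is only a reference point for the behaviour being asked about, not a fix for the parsnip interface.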

Thanks.

Reproducible example

library(tidymodels)

set.seed(200)
x1 <- rnorm(200)
x2 <- rnorm(200) + 2

df <- tibble(x1 = x1, x2 = x2)
df_test <- tibble(x1 = x1 + 1, x2 = x2 + 1)

# Fake single-level response, needed only so that a formula can be written
df <- df %>% mutate(DUMMY_RESPONSE_DUMMY = as.factor(rep(9999, nrow(df))))

# type = "one-svc" is passed through to kernlab as an engine argument
svm_rbf_spec <- svm_rbf() %>%
  set_mode("classification") %>%
  set_engine("kernlab", type = "one-svc")

svm_rbf_fit <- svm_rbf_spec %>%
  fit(DUMMY_RESPONSE_DUMMY ~ ., data = df)

predict(svm_rbf_fit, new_data = df_test)
#> Error in res$values: $ operator is invalid for atomic vectors

Created on 2023-05-25 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21 ucrt)
#>  os       Windows 11 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_Canada.utf8
#>  ctype    English_Canada.utf8
#>  tz       America/Vancouver
#>  date     2023-05-25
#>  pandoc   2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
#>  broom        * 1.0.4      2023-03-11 [1] CRAN (R 4.3.0)
#>  class          7.3-21     2023-01-23 [2] CRAN (R 4.3.0)
#>  cli            3.6.1      2023-03-23 [1] CRAN (R 4.3.0)
#>  codetools      0.2-19     2023-02-01 [2] CRAN (R 4.3.0)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  data.table     1.14.8     2023-02-17 [1] CRAN (R 4.3.0)
#>  dials        * 1.2.0      2023-04-03 [1] CRAN (R 4.3.0)
#>  DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.3.0)
#>  digest         0.6.31     2022-12-11 [1] CRAN (R 4.3.0)
#>  dplyr        * 1.1.2      2023-04-20 [1] CRAN (R 4.3.0)
#>  evaluate       0.21       2023-05-05 [1] CRAN (R 4.3.0)
#>  fansi          1.0.4      2023-01-22 [1] CRAN (R 4.3.0)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
#>  fs             1.6.2      2023-04-25 [1] CRAN (R 4.3.0)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.3.0)
#>  future         1.32.0     2023-03-07 [1] CRAN (R 4.3.0)
#>  future.apply   1.11.0     2023-05-21 [1] CRAN (R 4.3.0)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2      * 3.4.2      2023-04-03 [1] CRAN (R 4.3.0)
#>  globals        0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.3.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.3.0)
#>  gtable         0.3.3      2023-03-21 [1] CRAN (R 4.3.0)
#>  hardhat        1.3.0      2023-03-30 [1] CRAN (R 4.3.0)
#>  htmltools      0.5.5      2023-03-23 [1] CRAN (R 4.3.0)
#>  infer        * 1.0.4      2022-12-02 [1] CRAN (R 4.3.0)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.3.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
#>  kernlab        0.9-32     2023-01-31 [1] CRAN (R 4.3.0)
#>  knitr          1.43       2023-05-25 [1] CRAN (R 4.3.0)
#>  lattice        0.21-8     2023-04-05 [2] CRAN (R 4.3.0)
#>  lava           1.7.2.1    2023-02-27 [1] CRAN (R 4.3.0)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.3.0)
#>  lifecycle      1.0.3      2022-10-07 [1] CRAN (R 4.3.0)
#>  listenv        0.9.0      2022-12-16 [1] CRAN (R 4.3.0)
#>  lubridate      1.9.2      2023-02-10 [1] CRAN (R 4.3.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  MASS           7.3-58.4   2023-03-07 [2] CRAN (R 4.3.0)
#>  Matrix         1.5-4      2023-04-04 [2] CRAN (R 4.3.0)
#>  modeldata    * 1.1.0      2023-01-25 [1] CRAN (R 4.3.0)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  nnet           7.3-18     2022-09-28 [2] CRAN (R 4.3.0)
#>  parallelly     1.35.0     2023-03-23 [1] CRAN (R 4.3.0)
#>  parsnip      * 1.1.0      2023-04-12 [1] CRAN (R 4.3.0)
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  prodlim        2023.03.31 2023-04-02 [1] CRAN (R 4.3.0)
#>  purrr        * 1.0.1      2023-01-10 [1] CRAN (R 4.3.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils        2.12.2     2022-11-11 [1] CRAN (R 4.3.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp           1.0.10     2023-01-22 [1] CRAN (R 4.3.0)
#>  recipes      * 1.0.6      2023-04-25 [1] CRAN (R 4.3.0)
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang          1.1.1      2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown      2.21       2023-03-26 [1] CRAN (R 4.3.0)
#>  rpart          4.1.19     2022-10-21 [2] CRAN (R 4.3.0)
#>  rsample      * 1.1.1      2022-12-07 [1] CRAN (R 4.3.0)
#>  rstudioapi     0.14       2022-08-22 [1] CRAN (R 4.3.0)
#>  scales       * 1.2.1      2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler         1.10.0     2023-05-24 [1] CRAN (R 4.3.0)
#>  survival       3.5-5      2023-03-12 [2] CRAN (R 4.3.0)
#>  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidymodels   * 1.1.0      2023-05-01 [1] CRAN (R 4.3.0)
#>  tidyr        * 1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
#>  tidyselect     1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  timechange     0.2.0      2023-01-11 [1] CRAN (R 4.3.0)
#>  timeDate       4022.108   2023-01-07 [1] CRAN (R 4.3.0)
#>  tune         * 1.1.1      2023-04-11 [1] CRAN (R 4.3.0)
#>  utf8           1.2.3      2023-01-31 [1] CRAN (R 4.3.0)
#>  vctrs          0.6.2      2023-04-19 [1] CRAN (R 4.3.0)
#>  withr          2.5.0      2022-03-03 [1] CRAN (R 4.3.0)
#>  workflows    * 1.1.3      2023-02-22 [1] CRAN (R 4.3.0)
#>  workflowsets * 1.0.1      2023-04-06 [1] CRAN (R 4.3.0)
#>  xfun           0.39       2023-04-20 [1] CRAN (R 4.3.0)
#>  yaml           2.3.7      2023-01-23 [1] CRAN (R 4.3.0)
#>  yardstick    * 1.2.0      2023-04-21 [1] CRAN (R 4.3.0)
#>
#>  [1] C:/Users/amin/AppData/Local/R/win-library/4.3
#>  [2] C:/Program Files/R/R-4.3.0/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

EmilHvitfeldt commented 1 year ago

Hello @aminadibi 👋

Thanks for reporting this bug!

This bug happens in predict_class.model_fit(), where https://github.com/tidymodels/parsnip/blob/145bac2d319db9debdd4c3be062c8b20f1f9c780/R/predict_class.R#L31 returns a 1 x n logical matrix.

That object doesn't have a $values field, so this line errors:

https://github.com/tidymodels/parsnip/blob/145bac2d319db9debdd4c3be062c8b20f1f9c780/R/predict_class.R#L47
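The failure mode can be reproduced in base R without parsnip or kernlab. The sketch below uses a hand-made 1 x n logical matrix as a stand-in for the prediction object (the values are made up for illustration) and shows that `$` cannot be applied to it:

```r
# A 1 x n logical matrix standing in for the raw prediction
# (hypothetical values; the real ones come from the kernlab engine).
res <- matrix(c(TRUE, FALSE, TRUE), nrow = 1)

is.atomic(res)  # a matrix is an atomic vector with a dim attribute
is.list(res)    # it is not a list, so it has no named fields

# `$` only works on recursive objects such as lists, so this call errors;
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(res$values, error = function(e) conditionMessage(e))
print(msg)
```

The captured message matches the error in the reprex, which is consistent with the prediction code expecting a list-like result with a `values` element.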

simonpcouch commented 10 months ago

Do you think this is in-scope for tidyclust, @EmilHvitfeldt?

EmilHvitfeldt commented 10 months ago

I don't know if I would put it under clustering. It feels much closer to something like applicable. It is a type of anomaly detection, like https://github.com/tidymodels/applicable/issues/19, right?