koalaverse / homlr

Supplementary material for Hands-On Machine Learning with R, an applied book covering the fundamentals of machine learning with R.
https://koalaverse.github.io/homlr
Creative Commons Attribution Share Alike 4.0 International

Unnecessary pattern in 3.8.3 #70

Open guidothekp opened 1 year ago

guidothekp commented 1 year ago

Section 3.8.3 shows an example of putting the whole process together, using the following snippet:

blueprint <- recipe(Sale_Price ~ ., data = ames_train) %>%
  step_nzv(all_nominal())  %>%
  step_integer(matches("Qual|Cond|QC|Qu")) %>%
  step_center(all_numeric(), -all_outcomes()) %>%
  step_scale(all_numeric(), -all_outcomes()) %>%
  step_pca(all_numeric(), -all_outcomes())

In the step step_integer(matches("Qual|Cond|QC|Qu")), the alternative Qual is unnecessary: any column name that contains Qual also contains Qu, so Qu already covers it. We can verify this:

> ames_train %>% select(matches("Qual")) %>% names
[1] "Overall_Qual"    "Exter_Qual"      "Bsmt_Qual"       "Low_Qual_Fin_SF"
[5] "Kitchen_Qual"    "Garage_Qual"    
> ames_train %>% select(matches("Qu")) %>% names
[1] "Overall_Qual"    "Exter_Qual"      "Bsmt_Qual"       "Low_Qual_Fin_SF"
[5] "Kitchen_Qual"    "Fireplace_Qu"    "Garage_Qual"    
> ames_train %>% select(matches("Qual|Qu")) %>% names
[1] "Overall_Qual"    "Exter_Qual"      "Bsmt_Qual"       "Low_Qual_Fin_SF"
[5] "Kitchen_Qual"    "Fireplace_Qu"    "Garage_Qual"  

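Every column matched by Qual is also matched by Qu (which additionally picks up Fireplace_Qu), so Qual can be dropped from the pattern without changing which columns are selected.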
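The same check can be sketched in base R without loading the data. This is only an illustration: the `cols` vector is copied by hand from the output above rather than taken from the full `ames_train`, and `grepl(..., ignore.case = TRUE)` is used as a stand-in for `matches()`, which ignores case by default.

```r
# Column names copied from the select() output above (not the full data set).
cols <- c("Overall_Qual", "Exter_Qual", "Bsmt_Qual", "Low_Qual_Fin_SF",
          "Kitchen_Qual", "Fireplace_Qu", "Garage_Qual")

# Logical masks for the pattern with and without the "Qual" alternative.
with_qual    <- grepl("Qual|Qu", cols, ignore.case = TRUE)
without_qual <- grepl("Qu", cols, ignore.case = TRUE)

# TRUE when dropping "Qual" leaves the selection unchanged.
same_selection <- identical(with_qual, without_qual)

# Columns that "Qu" matches but "Qual" alone does not.
only_qu <- setdiff(cols[without_qual],
                   cols[grepl("Qual", cols, ignore.case = TRUE)])

print(same_selection)
print(only_qu)
```

If `same_selection` is `TRUE`, the two patterns pick the same columns from this list, and `only_qu` shows what `Qu` adds beyond `Qual`.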