rsquaredacademy / olsrr

Tools for developing OLS regression models
https://olsrr.rsquaredacademy.com/

Stepwise forward regression fails when model formula contains inline functions #8

Closed by aravindhebbali 7 years ago

aravindhebbali commented 7 years ago

ols_step_forward() returns an error when the model formula contains inline functions (such as log()) or interaction terms.

> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price  ~ beds + baths + log(sqft), data = Sacramento)
> ols_step_forward(lm_fit2)
We are selecting variables based on p value...
Error in eval(predvars, data, env) : object 'sqft' not found
Called from: eval(predvars, data, env)

> lm_fit1 <- lm(log(price)  ~ . - city, data = Sacramento)
> ols_step_forward(lm_fit1)
We are selecting variables based on p value...
Error in eval(predvars, data, env) : object 'price' not found
Called from: eval(predvars, data, env)

# interaction variables
> lm_fit3 <- lm(mpg ~ disp + hp + wt + am * disp, data = mtcars)
> ols_step_forward(lm_fit3)
We are selecting variables based on p value...
1 variable(s) added....
1 variable(s) added...
1 variable(s) added...
1 variable(s) added...
Error in ols_mallows_cp(fr$model, model) : 
  model must be a subset of full model
Called from: ols_mallows_cp(fr$model, model)
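
One possible reading of the first two errors, offered as an assumption and not as a confirmed diagnosis of olsrr's internals: if a stepwise routine refits candidate models against the fitted model's frame instead of the original data, the raw variable behind an inline function is no longer available, because the model frame stores the transformed column under the term label. The sketch below uses base R only, with mtcars and an illustrative formula, to show how term labels, model frame columns, and raw variable names differ, and how interaction terms expand.

```r
# Base R only; mtcars ships with the datasets package.
# The formula here is illustrative, not the one from the report.
fit <- lm(mpg ~ disp + log(hp), data = mtcars)

# Term labels keep the inline call exactly as written in the formula.
term_labels <- attr(terms(fit), "term.labels")

# The model frame stores the transformed column under that same label,
# so there is no column named `hp` in it.
frame_cols <- names(model.frame(fit))

# all.vars() recovers the underlying (untransformed) variable names.
raw_vars <- all.vars(formula(fit))

# Refitting against the model frame instead of the original data fails,
# because log(hp) is re-evaluated and `hp` cannot be found.
refit_error <- tryCatch(
  {
    lm(mpg ~ log(hp), data = model.frame(fit))
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

# An interaction written with `*` expands to main effects plus the
# interaction term, so the candidate list is longer than the formula suggests.
interaction_labels <- attr(
  terms(mpg ~ disp + hp + wt + am * disp),
  "term.labels"
)

print(term_labels)
print(frame_cols)
print(raw_vars)
print(refit_error)
print(interaction_labels)
```

Under this assumption, refitting against the original data frame, or building candidate formulas from the term labels while keeping the original data, would avoid the lookup failure.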

Session Info

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252   
[3] LC_MONETARY=English_India.1252 LC_NUMERIC=C                  
[5] LC_TIME=English_India.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rcpp_0.12.7     gridExtra_2.0.0 tidyr_0.6.0     tibble_1.2     
[5] purrr_0.2.2     dplyr_0.5.0     caret_6.0-76    ggplot2_2.2.1  
[9] lattice_0.20-35

loaded via a namespace (and not attached):
 [1] magrittr_1.5       splines_3.4.0      MASS_7.3-47       
 [4] munsell_0.4.3      colorspace_1.2-7   R6_2.2.1          
 [7] foreach_1.4.3      minqa_1.2.4        stringr_1.1.0     
[10] car_2.1-2          plyr_1.8.4         tools_3.4.0       
[13] parallel_3.4.0     nnet_7.3-12        pbkrtest_0.4-6    
[16] grid_3.4.0         gtable_0.2.0       nlme_3.1-125      
[19] mgcv_1.8-17        quantreg_5.19      DBI_0.5-1         
[22] MatrixModels_0.4-1 iterators_1.0.8    lme4_1.1-11       
[25] lazyeval_0.2.0     assertthat_0.2.0   Matrix_1.2-9      
[28] nloptr_1.0.4       reshape2_1.4.2     ModelMetrics_1.1.0
[31] codetools_0.2-15   stringi_1.1.2      compiler_3.4.0    
[34] scales_0.4.1       stats4_3.4.0       SparseM_1.7 

aravindhebbali commented 7 years ago

ols_step_forward() no longer returns an error when the model formula contains inline functions or interaction variables.

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price  ~ beds + baths + log(sqft), data = Sacramento)
> ols_step_forward(lm_fit2)
We are selecting variables based on p value...
1 variable(s) added....
1 variable(s) added...
No more variables satisfy the condition of penter: 0.3
Forward Selection Method                                                        

Candidate Terms:                                                                

1 . beds                                                                        
2 . baths                                                                       
3 . log(sqft)                                                                   

-------------------------------------------------------------------------------
                               Selection Summary                                
-------------------------------------------------------------------------------
        Variable                   Adj.                                            
Step     Entered     R-Square    R-Square     C(p)         AIC          RMSE       
-------------------------------------------------------------------------------
   1    log(sqft)       0.568       0.567    52.6943    23833.1040    86242.3553    
   2    beds            0.591       0.590     2.9559    23784.5900    83981.7543    
-------------------------------------------------------------------------------

# interaction variables
> lm_fit3 <- lm(mpg ~ disp + hp + wt + am * disp, data = mtcars)
> ols_step_forward(lm_fit3)
We are selecting variables based on p value...
1 variable(s) added....
1 variable(s) added...
1 variable(s) added...
1 variable(s) added...
1 variable(s) added...
Forward Selection Method                                                  

Candidate Terms:                                                          

1 . disp                                                                  
2 . hp                                                                    
3 . wt                                                                    
4 . am                                                                    
5 . disp:am                                                               

-------------------------------------------------------------------------
                            Selection Summary                             
-------------------------------------------------------------------------
        Variable                  Adj.                                       
Step    Entered     R-Square    R-Square     C(p)        AIC        RMSE     
-------------------------------------------------------------------------
   1    wt             0.753       0.745    15.7814    166.0294    3.0458    
   2    hp             0.827       0.815     4.6820    156.6523    2.5935    
   3    am             0.840       0.823     4.3607    156.1348    2.5375    
   4    disp:am        0.853       0.831     4.0081    155.3638    2.4747    
   5    disp           0.853       0.825     6.0000    157.3538    2.5213    
-------------------------------------------------------------------------