tlverse / hal9001

🤠 📿 The Highly Adaptive Lasso
https://tlverse.org/hal9001
GNU General Public License v3.0

Still have bases with interactions when setting max_degree = 1 #111

Open SeraphinaShi opened 7 months ago

SeraphinaShi commented 7 months ago

Where I ran into the issue:

I have the following:
X: (W, A), an n x 2 numeric matrix
Y: a binary outcome, stored as a numeric vector of length n

Then I fit HAL with the following code:

    fit_obj <- fit_hal(
      X = X, Y = Y, family = "binomial",
      return_x_basis = TRUE,
      num_knots = hal9001:::num_knots_generator(
        max_degree = 1, smoothness_orders = 1
      )
    )

I still get interaction terms. Here are the first few rows of the output:

    summary(fit_obj)
    Summary of non-zero coefficients is based on lambda of 0.002024853

             coef  term
    -5.575820e+00  (Intercept)
     1.710380e+00  [ I(A >= 0)(A - 0)^1 ]
    -3.430264e-01  [ I(W >= -0.804)(W - -0.804)^1 ] [ I(A >= 0.396)(A - 0.396)^1 ]
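A quick way to tell main terms from interactions in a summary like the one above is to count the distinct variables in each term. This is a minimal base-R sketch, assuming the term strings follow the `[ I(VAR >= c)(VAR - c)^k ]` pattern shown in the output; `count_term_vars` is a hypothetical helper and not part of hal9001.

```r
# Hypothetical helper: count the distinct variables appearing in one term string.
count_term_vars <- function(term) {
  pattern <- "I\\(([A-Za-z0-9_.]+) >="
  # Every indicator "I(VAR >=" contributes one match; the intercept has none.
  hits <- regmatches(term, gregexpr(pattern, term))[[1]]
  vars <- sub(pattern, "\\1", hits)
  length(unique(vars))
}

# The three terms copied from the summary output in this comment.
terms <- c(
  "(Intercept)",
  "[ I(A >= 0)(A - 0)^1 ]",
  "[ I(W >= -0.804)(W - -0.804)^1 ] [ I(A >= 0.396)(A - 0.396)^1 ]"
)

n_vars <- vapply(terms, count_term_vars, integer(1), USE.NAMES = FALSE)

# Terms involving more than one variable are interactions.
is_interaction <- n_vars > 1
print(data.frame(n_vars = n_vars, is_interaction = is_interaction))
```

Here the third term involves both W and A, which is the interaction that should not appear in a degree-1 fit.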

rachaelvp commented 6 months ago

Hi @SeraphinaShi thanks for filing the issue and sorry about the delay! Could you please send your session info and a reproducible example so I can replicate this issue on my laptop?

SeraphinaShi commented 6 months ago

Yeah of course!

Here is a reproducible example:

    library(hal9001)

    set.seed(123)
    n <- 500

    # Exogenous noise for the outcome and the treatment
    U_Y <- runif(n, 0, 1)
    U_A <- rnorm(n, 0, 2)

    W <- rnorm(n, 0, 1)

    # Treatment depends on W and is truncated to [0, 5]
    A <- 2 - 0.5 * W + U_A
    A[A <= 0] <- 0
    A[A >= 5] <- 5

    X <- cbind(W, A)
    Y <- as.numeric(U_Y < plogis(-5 + W + 2.25 * A - 0.5 * W * A))

    fit_obj <- fit_hal(
      X = X, Y = Y, family = "binomial",
      return_x_basis = TRUE,
      num_knots = hal9001:::num_knots_generator(
        max_degree = 1, smoothness_orders = 1
      )
    )

    summary(fit_obj)

Here is my session info:

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] mvtnorm_1.2-3 gridExtra_2.3 cowplot_1.1.1 ggplot2_3.4.3 pROC_1.18.4 R.utils_2.12.2
[7] R.oo_1.25.0 R.methodsS3_1.8.2 tictoc_1.2 hal9001_0.4.3 Rcpp_1.0.11 origami_1.0.7
[13] glmnet_4.1-8 Matrix_1.6-1.1 stringr_1.5.0 foreach_1.5.2 tidyr_1.3.0 dplyr_1.1.4
[19] data.table_1.14.8 here_1.0.1

loaded via a namespace (and not attached):
[1] utf8_1.2.3 future_1.33.0 generics_0.1.3 shape_1.4.6 stringi_1.7.12
[6] lattice_0.21-8 listenv_0.9.0 digest_0.6.33 magrittr_2.0.3 evaluate_0.21
[11] iterators_1.0.14 fastmap_1.1.1 plyr_1.8.8 rprojroot_2.0.3 survival_3.5-7
[16] purrr_1.0.2 fansi_1.0.4 scales_1.2.1 codetools_0.2-19 abind_1.4-5
[21] cli_3.6.1 rlang_1.1.1 parallelly_1.36.0 future.apply_1.11.0 munsell_0.5.0
[26] splines_4.3.1 withr_2.5.1 yaml_2.3.7 tools_4.3.1 parallel_4.3.1
[31] colorspace_2.1-0 globals_0.16.2 assertthat_0.2.1 vctrs_0.6.4 R6_2.5.1
[36] lifecycle_1.0.3 pkgconfig_2.0.3 pillar_1.9.0 gtable_0.3.4 glue_1.6.2
[41] xfun_0.40 tibble_3.2.1 tidyselect_1.2.0 rstudioapi_0.15.0 knitr_1.44
[46] htmltools_0.5.6 rmarkdown_2.25 compiler_4.3.1
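
One possible explanation, offered as a guess rather than a confirmed diagnosis: in the call above, `max_degree = 1` is given only to `num_knots_generator()`, which sets the number of knots per interaction degree, while `fit_hal()` has its own `max_degree` argument that is left at its default. The sketch below passes `max_degree = 1` and `smoothness_orders = 1` to `fit_hal()` directly and then counts the variables in each basis function. It assumes hal9001 0.4.x, where the fitted object carries a `basis_list` whose elements have a `cols` field; the names `fit_main` and `vars_per_basis` are only for illustration.

```r
# Simulate the data from the reproducible example in this thread.
set.seed(123)
n <- 500
U_Y <- runif(n, 0, 1)
U_A <- rnorm(n, 0, 2)
W <- rnorm(n, 0, 1)
A <- 2 - 0.5 * W + U_A
A[A <= 0] <- 0
A[A >= 5] <- 5
X <- cbind(W, A)
Y <- as.numeric(U_Y < plogis(-5 + W + 2.25 * A - 0.5 * W * A))

if (requireNamespace("hal9001", quietly = TRUE)) {
  # Assumption: max_degree and smoothness_orders are arguments of fit_hal()
  # itself, so they are set here instead of only inside num_knots_generator().
  fit_main <- hal9001::fit_hal(
    X = X, Y = Y, family = "binomial",
    max_degree = 1, smoothness_orders = 1,
    return_x_basis = TRUE
  )

  # Number of columns of X that enter each candidate basis function;
  # a value of 1 means the basis is a main term, not an interaction.
  vars_per_basis <- lengths(lapply(fit_main$basis_list, function(b) b$cols))
  print(table(vars_per_basis))

  print(summary(fit_main))
}
```

If every entry of `vars_per_basis` is 1, the fit contains main terms only, and the summary should no longer list terms that combine W and A.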