Open brtang63 opened 1 year ago
Thanks. I can reproduce this on my laptop. It may be caused by the extremely large value of the deviance when setting support.size = 0:14.
> abess(x, y, tune.type = "gic", family = "poisson", support.size = 0:13)
Call:
abess.default(x = x, y = y, family = "poisson", tune.type = "gic", support.size = 0:13)
   support.size           dev          GIC
1             0 -7.581848e+14 -1.51637e+15
2             1 -2.298525e+34 -4.59705e+34
3             2 -2.298525e+34 -4.59705e+34
4             3 -2.298525e+34 -4.59705e+34
5             4 -2.298525e+34 -4.59705e+34
6             5 -2.298525e+34 -4.59705e+34
7             6 -2.298525e+34 -4.59705e+34
8             7 -2.298525e+34 -4.59705e+34
9             8 -2.298525e+34 -4.59705e+34
10            9 -2.298525e+34 -4.59705e+34
11           10 -2.298525e+34 -4.59705e+34
12           11 -2.298525e+34 -4.59705e+34
13           12 -2.298525e+34 -4.59705e+34
14           13 -2.298525e+34 -4.59705e+34
@oooo26, I have uploaded two files, poisson_y.csv and poisson_x.csv, which correspond to y and x, respectively. Can you test whether this issue happens in Python?
poisson_x.csv
poisson_y.csv
Hi, sorry for the late response. I have checked in Python, and the problem does not seem to occur there.
ABESS version: latest, v0.4.6 (PyPI)
Python version: 3.9.12
Here is the test code:
import numpy as np
import pandas as pd
import abess
X = pd.read_csv("poisson_x.csv")
y = pd.read_csv("poisson_y.csv").squeeze()
print(X.shape)
print(y.shape)
model = abess.PoissonRegression(
    support_size=range(15),  # 0:14
    cv=5  # both CV and IC are working
)
model.fit(X, y)
print(f"Sparsity: {np.count_nonzero(model.coef_)}")
print(f"Non-zero: {np.nonzero(model.coef_)[0]}")
print(f"Train Loss: {model.train_loss_}")
print(f"Test Loss: {model.eval_loss_}")
######
# Sparsity: 4
# Non-zero: [122 352 573 769]
# Train Loss: -2360540438301305.5
# Test Loss: -729389503380903.0
######
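The very large negative losses reported in this thread are plausible if the loss is a Poisson negative log-likelihood with the constant log(y!) term dropped. That form is an assumption made here for illustration, not something confirmed from the abess source. Under that assumption, the following sketch (hypothetical `poisson_loss` helper and made-up responses) shows that even a perfect fit gives a hugely negative loss once the responses are large:

```python
import numpy as np

def poisson_loss(y, eta):
    """Poisson negative log-likelihood without the log(y!) constant.

    This form is assumed for illustration only; it is not taken from abess.
    """
    return float(np.sum(np.exp(eta) - y * eta))

# Hypothetical responses spanning many orders of magnitude, as a log link
# with large coefficients can produce.
y = np.array([1e3, 1e9, 1e15])

# A perfect fit on the mean scale: exp(eta_hat) equals y.
eta_hat = np.log(y)

loss = poisson_loss(y, eta_hat)
print(f"loss at a perfect fit: {loss:.3e}")
```

The `y * eta` term dominates, so the magnitude of the loss is driven by the largest responses rather than by the quality of the fit.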
@brtang63, can you check this issue on the latest abess R package? I believe this problem has been addressed.
Sorry for the late reply. I've tested with the latest CRAN version, 0.4.8, and the problem still happens occasionally. Note that the previous example I posted was not a good one, because the seed was set only for generate.data() and not for sample(). The following code is more reproducible: set.seed(1) works fine, but set.seed(2) still leads to this problem.
R version: 4.3.1
abess version: 0.4.8
library(abess)
set.seed(2)
n <- 100
p <- 1000
family <- "poisson"
snr <- Inf
beta <- rep(0, p)
nonzero <- sample(1:p, 10)
beta[nonzero] <- c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5)
k <- 10
data <- generate.data(n, p, beta = beta, snr = snr, family = family, support.size = k)
x <- data$x
y <- data$y
abess(x, y, tune.type = "cv", family = "poisson", support.size = 0:14)
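For anyone cross-checking from Python, the same reproducibility point (every random draw must come from the seeded generator, including the choice of the support) can be sketched with NumPy alone. The function `simulate_design` and the standard normal design are assumptions for illustration. This is not a port of generate.data(), and it stops at the linear predictor rather than sampling the response.

```python
import numpy as np

def simulate_design(seed, n=100, p=1000, k=10, effect=5.0):
    """Draw the support and the design matrix from a single seeded generator."""
    rng = np.random.default_rng(seed)
    # The support is drawn from the same generator, so it is reproducible too.
    nonzero = rng.choice(p, size=k, replace=False)
    beta = np.zeros(p)
    beta[nonzero] = effect
    # Assumed design: i.i.d. standard normal entries.
    X = rng.standard_normal((n, p))
    eta = X @ beta  # linear predictor on the log scale
    return np.sort(nonzero), X, eta

support_a, X_a, eta_a = simulate_design(seed=2)
support_b, X_b, eta_b = simulate_design(seed=2)

print("same support:", np.array_equal(support_a, support_b))
print("same design:", np.array_equal(X_a, X_b))
print(f"largest linear predictor: {eta_a.max():.1f}")
```

Printing the largest linear predictor gives a rough sense of how large exp(eta), and therefore the Poisson mean, can become with ten coefficients of 5.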
@brtang63 I guess this happens because the estimated coefficients are unbounded, owing to the nature of the Poisson distribution. In the new version of the abess library, you can use beta.max and beta.min to control the range of the estimated coefficients. You may refer to this link: https://github.com/abess-team/abess/issues/510#issuecomment-1732315856
I've encountered a strange issue: abess() does not terminate in a specific situation. The following code produces a reproducible example. It runs for at least 10 minutes without terminating. However, simply setting support.size = 0:13 or support.size = 14 makes it terminate immediately (perhaps within 1 second). Moreover, with tune.type = "gic" the issue does not occur either, which really confuses me.
The version of abess is 0.4.7 (installed from CRAN). I've tested the code on two different Linux systems and encountered the same issue on both.