import numpy as np
from glum import GeneralizedLinearRegressor
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 100))
X[:, 0] = 0
beta = np.zeros(100)
beta[:10] = 1
y = X @ beta + rng.normal(size=100)
model = GeneralizedLinearRegressor(family="normal", alpha=0.1, warm_start=True)
model.fit(X[:99, :], y[:99])
model.fit(X, y)
This raises
(glum) ~/code/icu-experiments [shared-lasso] $ python repr_example.py
/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/glum/_glm.py:407: RuntimeWarning: invalid value encountered in divide
coef[0] += np.squeeze(col_means / col_stds).dot(coef[1:])
Traceback (most recent call last):
File "/Users/mlondschien/code/icu-experiments/repr_example.py", line 13, in <module>
model.fit(X, y)
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/glum/_glm.py", line 2518, in fit
coef = self._solve(
^^^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/glum/_glm.py", line 1014, in _solve
coef, self.n_iter_, self._n_cycles, self.diagnostics_ = _irls_solver(
^^^^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/glum/_solvers.py", line 315, in _irls_solver
d, n_cycles_this_iter = inner_solver(state, data, hessian)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/glum/_solvers.py", line 36, in inner_fct
out = fct(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/glum/_solvers.py", line 52, in _least_squares_solver
d = linalg.solve(hessian, state.score, assume_a="pos")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/scipy/linalg/_basic.py", line 144, in solve
b1 = atleast_1d(_asarray_validated(b, check_finite=check_finite))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/scipy/_lib/_util.py", line 252, in _asarray_validated
a = toarray(a)
^^^^^^^^^^
File "/Users/mlondschien/mambaforge/envs/glum/lib/python3.11/site-packages/numpy/lib/function_base.py", line 628, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
Interesting. I think the solution is to use the function just above in the code you linked - _one_over_var_inf_to_val. That's how we've been dealing with this problem elsewhere in the codebase.
Minimal reproducible example:
This raises
The first warning hints at the problem.
A quick fix would be something along the lines of
as these columns don't contribute to
coef[0]
anyways. However, the question of how to account for floating point inaccuracies arises.