Nixtla / mlforecast

Scalable machine 🤖 learning for time series forecasting.
https://nixtlaverse.nixtla.io/mlforecast
Apache License 2.0

Seg fault with numpy 2.0 - possibly just from Polars #354

Closed: braaannigan closed this issue 1 week ago

braaannigan commented 2 weeks ago

What happened + What you expected to happen

Calling .fit or .cross_validation causes a segmentation fault with numpy 2.0. The MRE below works with a pandas DataFrame but fails with Polars (to confirm, try removing pl.from_pandas).

Versions / Dependencies

polars==0.20.31
pandas==2.2.2
numba==0.60.0
numpy==2.0.0 (no issues when version is 1.26.4)
mlforecast==0.13.0
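For anyone comparing their environment against the versions above, here is a minimal sketch that prints the installed versions using only the standard library. The `PACKAGES` list and the `collect_versions` helper are illustrative names, not part of mlforecast or polars.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in this report; hypothetical list, edit to match your setup.
PACKAGES = ["polars", "pandas", "numba", "numpy", "mlforecast"]


def collect_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in collect_versions(PACKAGES).items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Pasting the printed lines into a report makes it easier to spot a numpy 2.x install alongside an older polars.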

Reproduction script

import polars as pl
from sklearn.linear_model import LinearRegression
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences
from mlforecast.utils import generate_daily_series

# This works if we don't have pl.from_pandas
series = pl.from_pandas(
    generate_daily_series(
        n_series=20,
        max_length=100,
        n_static_features=1,
        static_as_categorical=False,
        with_trend=True,
    )
)
mf = MLForecast(
    models=[LinearRegression()],
    target_transforms=[Differences([1])],
    freq="1h",
)
mf.fit(series)
Fatal Python error: Segmentation fault

Thread 0x00007f483fa0a700 (most recent call first):
  File "/usr/local/lib/python3.10/site-packages/IPython/core/history.py", line 836 in _writeout_input_cache
  File "/usr/local/lib/python3.10/site-packages/IPython/core/history.py", line 853 in writeout_cache
  File "/usr/local/lib/python3.10/site-packages/IPython/core/history.py", line 61 in only_when_enabled
  File "/usr/local/lib/python3.10/site-packages/decorator.py", line 232 in fun
  File "/usr/local/lib/python3.10/site-packages/IPython/core/history.py", line 908 in run
  File "/usr/local/lib/python3.10/site-packages/IPython/core/history.py", line 61 in only_when_enabled
  File "/usr/local/lib/python3.10/site-packages/decorator.py", line 232 in fun
  File "/usr/local/lib/python3.10/threading.py", line 1009 in _bootstrap_inner
  File "/usr/local/lib/python3.10/threading.py", line 966 in _bootstrap

Current thread 0x00007f4841c22740 (most recent call first):
  File "/usr/local/lib/python3.10/site-packages/polars/series/series.py", line 4511 in to_numpy
  File "/usr/local/lib/python3.10/site-packages/utilsforecast/processing.py", line 676 in process_df
  File "/usr/local/lib/python3.10/site-packages/mlforecast/core.py", line 257 in _fit
  File "/usr/local/lib/python3.10/site-packages/mlforecast/core.py", line 487 in fit_transform
  File "/usr/local/lib/python3.10/site-packages/mlforecast/forecast.py", line 249 in preprocess
  File "/usr/local/lib/python3.10/site-packages/mlforecast/forecast.py", line 505 in fit
  File "<ipython-input-3-b80f6e17e2d9>", line 15 in <module>
  File "/usr/local/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577 in run_code
  File "/usr/local/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3517 in run_ast_nodes
  File "/usr/local/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3334 in run_cell_async
  File "/usr/local/lib/python3.10/site-packages/IPython/core/async_helpers.py", line 129 in _pseudo_sync_runner
  File "/usr/local/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3130 in _run_cell
  File "/usr/local/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3075 in run_cell
  File "/usr/local/lib/python3.10/site-packages/IPython/terminal/interactiveshell.py", line 910 in interact
  File "/usr/local/lib/python3.10/site-packages/IPython/terminal/interactiveshell.py", line 917 in mainloop
  File "/usr/local/lib/python3.10/site-packages/IPython/terminal/ipapp.py", line 317 in start
  File "/usr/local/lib/python3.10/site-packages/traitlets/config/application.py", line 1075 in launch_instance
  File "/usr/local/lib/python3.10/site-packages/IPython/__init__.py", line 130 in start_ipython
  File "/usr/local/bin/ipython", line 8 in <module>

Extension modules: numpy._core._multiarray_umath, numpy._core._multiarray_tests, numpy.linalg._umath_linalg, scipy._lib._ccallback_c, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, charset_normalizer.md, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.__check_build._check_build, psutil._psutil_linux, psutil._psutil_posix, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, 
scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.sparsefuncs_fast, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.utils._random, sklearn.utils._seq_dataset, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, 
sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.utils._vector_sentinel, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_distances_reduction._radius_neighbors_classmode, sklearn.metrics._pairwise_fast, sklearn.linear_model._cd_fast, _loss, sklearn._loss._loss, sklearn.utils.arrayfuncs, sklearn.svm._liblinear, sklearn.svm._libsvm, sklearn.svm._libsvm_sparse, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, pyarrow.lib, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, yaml._yaml, numba.core.typeconv._typeconv, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box (total: 202)
Segmentation fault

Issue Severity

High: It blocks me from completing my task.

braaannigan commented 2 weeks ago

The primary issue is with polars; I've reported it here: https://github.com/pola-rs/polars/issues/16998

Could mlforecast pin numpy<2.0 for the moment?

jmoralez commented 2 weeks ago

We currently have polars only as a dev requirement (to run the tests): https://github.com/Nixtla/mlforecast/blob/2ec60c1435189da542fbcb5480e77b11c2bf584f/settings.ini#L18-L25

In my opinion the pin should come from polars, since that's where the incompatibility is at the moment.

jmoralez commented 1 week ago

polars added the pin in their numpy extra, so this should be fixed by using: pip install "mlforecast[polars]" or pip install mlforecast "polars[numpy]". Feel free to reopen if you encounter this again.
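After reinstalling with one of the commands above, a quick check like the following can confirm which numpy was actually resolved. This is only a sketch: `numpy_major` and `needs_numpy_pin` are hypothetical helper names, and the "2.x crashes, 1.26.4 works" boundary is taken from this thread alone. Later polars releases may lift the pin, in which case a 2.x result would no longer indicate a problem.

```python
from importlib.metadata import PackageNotFoundError, version


def numpy_major(version_string):
    """Return the leading integer of a version string such as '2.0.0'."""
    return int(version_string.split(".")[0])


def needs_numpy_pin(version_string):
    """True for numpy 2.x or newer, the range reported to crash in this thread."""
    return numpy_major(version_string) >= 2


if __name__ == "__main__":
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        print("numpy is not installed")
    else:
        if needs_numpy_pin(installed):
            print(f"numpy {installed}: 2.x or newer, the pin was not applied")
        else:
            print(f"numpy {installed}: below 2.0")
```

Reading the version from package metadata avoids importing numpy or polars, so the check itself cannot trigger the crash.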