blue-yonder / tsfresh

Automatic extraction of relevant features from time series:
http://tsfresh.readthedocs.io
MIT License

'IndexError' Exception when using extract_features with an id with a single entry #936

Open matanost-netop opened 2 years ago

matanost-netop commented 2 years ago

I'm trying to extract features with extract_features from a time series prepared with tsfresh.utilities.dataframe_functions.make_forecasting_frame. I hit an unexpected exception, which I was able to reproduce with an example similar to the one in the tsfresh documentation at https://tsfresh.readthedocs.io/en/latest/text/forecasting.html#forecasting-label

I suspect that the problem occurs due to an id with a single entry, since it is solved when using min_timeshift=1 for roll_time_series.
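
To illustrate why min_timeshift=1 might matter, here is a minimal pure-Python sketch of forward rolling. It is a simplified model, not tsfresh's implementation: it assumes every prefix of an id's time-sorted rows becomes one window and that windows with length <= min_timeshift are dropped. The helper name `rolled_window_lengths` is hypothetical.

```python
from collections import defaultdict


def rolled_window_lengths(rows, min_timeshift=0):
    """Simplified model of forward rolling (an assumption, not tsfresh code).

    For each id, every prefix of the time-sorted rows becomes one window.
    Windows with length <= min_timeshift are dropped.
    Returns {(id, last_time_in_window): window_length}.
    """
    times_by_id = defaultdict(list)
    for row_id, time in rows:
        times_by_id[row_id].append(time)

    lengths = {}
    for row_id, times in times_by_id.items():
        times.sort()
        for n in range(1, len(times) + 1):
            if n > min_timeshift:
                lengths[(row_id, times[n - 1])] = n
    return lengths


# (id, time) pairs taken from the example DataFrame in this report
rows = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 8), (2, 9)]

default_windows = rolled_window_lengths(rows)
filtered_windows = rolled_window_lengths(rows, min_timeshift=1)

# Windows that contain only one row under the default setting
single_entry = sorted(key for key, n in default_windows.items() if n == 1)
print(single_entry)                       # [(1, 1), (2, 8)]
print(sorted(filtered_windows.values()))  # [2, 2, 3, 4]
```

Under this model, the default setting yields one single-row window per original id, and min_timeshift=1 removes exactly those windows.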

    import pandas as pd

    df = pd.DataFrame({
        "id": [1, 1, 1, 1, 2, 2],
        "time": [1, 2, 3, 4, 8, 9],
        "x": [1, 2, 3, 4, 10, 11],
        "y": [5, 6, 7, 8, 12, 13],
    })

    from tsfresh.utilities.dataframe_functions import roll_time_series
    df_rolled = roll_time_series(df, column_id="id", column_sort="time")

    from tsfresh import extract_features
    df_features = extract_features(df_rolled, column_id="id", column_sort="time")

The exception is:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/feature_extraction/extraction.py", line 164, in extract_features
        result = _do_extraction(
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/feature_extraction/extraction.py", line 294, in _do_extraction
        result = distributor.map_reduce(
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/utilities/distribution.py", line 241, in map_reduce
        result = list(itertools.chain.from_iterable(result))
      File "/opt/homebrew/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
        for obj in iterable:
      File "/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py", line 870, in next
        raise value
      File "/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py", line 125, in worker
        result = (True, func(*args, **kwds))
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/utilities/distribution.py", line 43, in _function_with_partly_reduce
        results = list(itertools.chain.from_iterable(results))
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/utilities/distribution.py", line 42, in <genexpr>
        results = (map_function(chunk, **kwargs) for chunk in chunk_list)
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/feature_extraction/extraction.py", line 386, in _do_extraction_on_chunk
        return list(_f())
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/feature_extraction/extraction.py", line 364, in _f
        result = func(x, param=parameter_list)
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/feature_extraction/feature_calculators.py", line 2103, in friedrich_coefficients
        calculated[m][r] = _estimate_friedrich_coefficients(x, m, r)
      File "/opt/homebrew/lib/python3.9/site-packages/tsfresh/feature_extraction/feature_calculators.py", line 152, in _estimate_friedrich_coefficients
        df["quantiles"] = pd.qcut(df.signal, r)
      File "/opt/homebrew/lib/python3.9/site-packages/pandas/core/reshape/tile.py", line 376, in qcut
        bins = np.quantile(x_np, quantiles)
      File "<__array_function__ internals>", line 5, in quantile
      File "/opt/homebrew/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3979, in quantile
        return _quantile_unchecked(
      File "/opt/homebrew/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3986, in _quantile_unchecked
        r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
      File "/opt/homebrew/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3564, in _ureduce
        r = func(a, **kwargs)
      File "/opt/homebrew/lib/python3.9/site-packages/numpy/lib/function_base.py", line 4109, in _quantile_ureduce_func
        x_below = take(ap, indices_below, axis=0)
      File "<__array_function__ internals>", line 5, in take
      File "/opt/homebrew/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 190, in take
        return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
      File "/opt/homebrew/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
        return bound(*args, **kwds)
    IndexError: cannot do a non-empty take from an empty axes.
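
One plausible reading of the traceback, stated as an assumption rather than confirmed from the tsfresh source: friedrich_coefficients pairs each value with the change to its successor before binning the values with pd.qcut. A series with a single entry then leaves nothing to bin, which would match the "empty axes" message. The helper `signal_delta_pairs` below is hypothetical and only sketches that pairing step in plain Python.

```python
def signal_delta_pairs(x):
    """Pair each value with the change to its successor.

    Hypothetical helper illustrating the assumed preprocessing; a series of
    length n yields n - 1 pairs, so a single-entry series yields none.
    """
    return [(x[i], x[i + 1] - x[i]) for i in range(len(x) - 1)]


# First rolled window of id 2 in the example holds only x = 10
pairs_single = signal_delta_pairs([10])
# The next window of id 2 holds x = 10, 11
pairs_longer = signal_delta_pairs([10, 11])

print(pairs_single)  # []
print(pairs_longer)  # [(10, 1)]
```

If this reading is right, any window with fewer than two rows would hand an empty array to the quantile computation.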

I'm running macOS Monterey 12.0.1 with Python 3 and install packages with pip.

Python version: 3.9.10 (main, Jan 15 2022, 11:40:53) [Clang 13.0.0 (clang-1300.0.29.3)]

Relevant library versions: tsfresh==0.19.0, numpy==1.21.5, pandas==1.4.1

lucas8338 commented 2 years ago

I have the same error. It only happens on Python 3.8+; on Python 3.7 it runs without error. I can't use Python 3.7 because of another project's requirements.

Note: my OS is Windows 10.

huangxianyang commented 2 years ago

I have the same error on Linux with Python 3.8.