blue-yonder / tsfresh

Automatic extraction of relevant features from time series:
http://tsfresh.readthedocs.io
MIT License

Can Not Rolling With Binary Feature #1011

Closed b-y-f closed 1 year ago

b-y-f commented 1 year ago

The problem:

import pandas as pd
from tsfresh import extract_features
from tsfresh.utilities.dataframe_functions import roll_time_series

# Two series (id 1 and id 2); "y" is a boolean column
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "time": [1, 2, 3, 4, 8, 9],
    "x": [1, 2, 3, 4, 10, 11],
    "y": [True, True, True, False, False, False],
})

# Roll the series into sub-windows
df_rolled = roll_time_series(df, column_id="id", column_sort="time")

# Extract features from the rolled data; this call raises the TypeError
df_features = extract_features(df_rolled, column_id="id", column_sort="time")

Running this raises an error.

Anything else we need to know?:

File /opt/homebrew/Caskroom/miniforge/base/envs/lake/lib/python3.10/site-packages/tsfresh/feature_extraction/feature_calculators.py:315, in symmetry_looking()
    313     x = np.asarray(x)
    314 mean_median_difference = np.abs(np.mean(x) - np.median(x))
--> 315 max_min_difference = np.max(x) - np.min(x)
    316 return [
    317     ("r_{}".format(r["r"]), mean_median_difference < (r["r"] * max_min_difference))
    318     for r in param
    319 ]

TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

Environment:

nils-braun commented 1 year ago

Actually, it is not the rolling that is problematic but the feature extraction after it. The feature calculators in tsfresh only work for numerical features; for binary features most of them are not well defined. What you can do is convert your boolean column to -1 and 1, or to 0 and 1, and then run the feature extraction.
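
A minimal sketch of that conversion, assuming the `df` from the report and the 0/1 mapping (the -1/1 variant would need an explicit mapping instead of a cast). It uses only pandas and NumPy, does not call tsfresh, and repeats the expression from the traceback to check that the subtraction no longer fails:

```python
import numpy as np
import pandas as pd

# Same data as in the report; "y" is the boolean column.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "time": [1, 2, 3, 4, 8, 9],
    "x": [1, 2, 3, 4, 10, 11],
    "y": [True, True, True, False, False, False],
})

# Find every boolean column and cast it to integers (True -> 1, False -> 0).
bool_columns = df.select_dtypes(include="bool").columns
df = df.astype({column: int for column in bool_columns})

# The expression that failed inside symmetry_looking now works,
# because the values are integers rather than NumPy booleans.
y = np.asarray(df["y"])
max_min_difference = np.max(y) - np.min(y)

print(df["y"].tolist())    # [1, 1, 1, 0, 0, 0]
print(max_min_difference)  # 1
```

The converted `df` can then be passed to `roll_time_series` and `extract_features` exactly as in the original snippet. Whether the resulting features are meaningful for a binary signal is a separate question.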

b-y-f commented 1 year ago

@nils-braun Thanks, changed to 0,1