blue-yonder / tsfresh

Automatic extraction of relevant features from time series:
http://tsfresh.readthedocs.io
MIT License

Debugging tsfresh methods from Notebook freezes whole process #1023

Closed by wojnarabc 1 year ago

wojnarabc commented 1 year ago

The problem:

Debugging code that calls roll_time_series() or extract_features() directly from a Jupyter Notebook freezes the debugging process when the n_jobs parameter is set to anything other than 1.

Recreation steps:

  1. Define the following function in a .py file of your choice:
    from tsfresh.utilities.dataframe_functions import roll_time_series

    def debug_tsfresh(df):
        df_rolled = roll_time_series(df, column_id="id", column_sort="time")
        return df_rolled
  2. Create a dummy dataframe in a Notebook:
    import pandas as pd

    df = pd.DataFrame({
        "id": [1, 1, 1, 1, 2, 2],
        "time": [1, 2, 3, 4, 8, 9],
        "x": [1, 2, 3, 4, 10, 11],
        "y": [5, 6, 7, 8, 12, 13],
    })
  3. Import debug_tsfresh() and execute debug_tsfresh(df) from the notebook in debugging mode.

Setting n_jobs=1 for both tsfresh functions solves the problem, but it would be great to have debugging work with more parallel jobs.

Anything else we need to know?:

When the function is executed from another .py file, everything seems to work well with n_jobs left at its default.

Environment:

shreyash-Pandey-Katni commented 1 year ago

Hi @wojnarabc, will you please share the exact code? I tried reproducing it but it is working perfectly fine for me.

wojnarabc commented 1 year ago

Hello @shreyash-Pandey-Katni, I am still getting stuck in debugger mode (Visual Studio Code on my end). As for the scripts, content of the .py file:


from tsfresh.utilities.dataframe_functions import impute, roll_time_series

def debug_tsfresh(df):
    df_rolled = roll_time_series(df, column_id="id", column_sort="time", n_jobs=2)
    return df_rolled

Content of the JupyterLab .ipynb file:

import pandas as pd
from debug_tsfresh import debug_tsfresh

df = pd.DataFrame({
   "id": [1, 1, 1, 1, 2, 2],
   "time": [1, 2, 3, 4, 8, 9],
   "x": [1, 2, 3, 4, 10, 11],
   "y": [5, 6, 7, 8, 12, 13],
})

debug_tsfresh(df)

Running the cell with debug_tsfresh(df) in debugging mode freezes the process, while simply executing the cell works fine.

shreyash-Pandey-Katni commented 1 year ago

Hi @wojnarabc, I tried it the same way, but I am not getting any error. Others may be able to help you, as I cannot reproduce the issue even after following the exact steps.

wojnarabc commented 1 year ago

Thanks for checking, @shreyash-Pandey-Katni. I guess it's something on my end then. As it's not critical for me, let's close it. I'm fine with debugging with n_jobs set to 1 and executing with a higher value of this parameter.
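
For anyone who lands here with the same symptom, the workaround described in this thread (n_jobs=1 while debugging, a higher value otherwise) could be sketched roughly as below. The helper choose_n_jobs() and its debugging flag are hypothetical names, not part of tsfresh. The sys.gettrace() check is only a heuristic, on the assumption that the debugger installs a trace function, which may not hold for every debugger or Python version, so passing the flag explicitly is the safer option.

```python
import sys


def choose_n_jobs(requested, debugging=None):
    """Return 1 while debugging, otherwise the requested number of jobs.

    Hypothetical helper, not part of tsfresh.
    """
    if debugging is None:
        # Heuristic assumption: a debugger that uses tracing installs a
        # trace function, so sys.gettrace() is not None while it is active.
        debugging = sys.gettrace() is not None
    return 1 if debugging else requested


def debug_tsfresh(df, n_jobs=2, debugging=None):
    """Roll the time series, falling back to n_jobs=1 under a debugger."""
    # Imported lazily so that the helper above can be used without tsfresh.
    from tsfresh.utilities.dataframe_functions import roll_time_series

    return roll_time_series(
        df,
        column_id="id",
        column_sort="time",
        n_jobs=choose_n_jobs(n_jobs, debugging),
    )
```

In the notebook, debug_tsfresh(df, debugging=True) would force the single-job path when stepping through the code, and debug_tsfresh(df, debugging=False) would keep the requested n_jobs for a normal run.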