blue-yonder / tsfresh

Automatic extraction of relevant features from time series:
http://tsfresh.readthedocs.io
MIT License

Function extract_features() does not stop when interrupting the kernel #818

Open mendel5 opened 3 years ago

mendel5 commented 3 years ago

The function extract_features() can be very computationally intensive when there are a lot of columns (features) in the rolled data frame. Sometimes I would like to make changes to the already running extract_features() function, e.g. set the parameter default_fc_parameters to a different setting.

I am used to pressing the square icon at the top, next to "Run" (the one that looks like a "stop playback" button). Its tooltip says "interrupt the kernel". However, when I press this button, the extract_features() function does not stop calculating.

To get it to stop I need to kill the process through the Task Manager or shut down Jupyter Notebook completely, which is annoying.

I'm not sure if others experience this problem as well or if I set a wrong setting somewhere.

nils-braun commented 3 years ago

That works for me (I just tried it with the robot execution failures dataset in our repo): when I interrupt the kernel, the process stops correctly. Which Python version, OS, and Jupyter version are you using?

mendel5 commented 3 years ago

Operating system:

Windows 7, 64-bit

Python version:

Python 3.8.7 (tags/v3.8.7:6503f05, Dec 21 2020, 17:59:51) [MSC v.1928 64 bit (AMD64)]

Jupyter version:

$ jupyter notebook --version
6.1.5

I know Windows 7 has reached its End of Support. There seems to be a newer version of Jupyter Notebook (6.2.0) available at https://github.com/jupyter/notebook/releases.

The PC I'm working on is very old. I'm quite sure this issue has something to do with the PC. Next week I will try to switch to a newer Linux machine.

nils-braun commented 3 years ago

I might be wrong, but this may be a Jupyter Notebook issue on Windows. That could also explain why it works for me (I am using Ubuntu).

mendel5 commented 3 years ago

Today I was finally able to move from the Windows 7 machine to a Linux machine. Unfortunately this issue still occurs.

Here is my code:

from tsfresh import extract_features
from tsfresh.feature_extraction import ComprehensiveFCParameters
from tsfresh.utilities.dataframe_functions import impute

extraction_settings = ComprehensiveFCParameters()

X = extract_features(df_rolled,
                     column_id='id',
                     column_sort='date',
                     default_fc_parameters=extraction_settings,
                     impute_function=impute,
                     n_jobs=7)

The CPU of the Linux machine is an Intel Core i7-7700K with 4 cores and 8 threads. With the parameter n_jobs=7 I am using 7 of the 8 available logical cores. The machine has 64 GB of RAM with 4 x 16 GB DDR4 at 2400 MHz.

When I press the Interrupt kernel button I get a red error box below the code cell in Jupyter Notebook. However, the 7 worker processes keep running in the background. The only way to stop them is to shut down the specific notebook in the Running tab of Jupyter Notebook.

foooooooooooooooobar commented 3 years ago

I actually get the same on macOS 10.15.7; I tried n_jobs=0 as well, but for some reason it still doesn't stop. I am using a Jupyter notebook to run the code.

nils-braun commented 3 years ago

Interesting! Would someone be able to share the dimensions of your data (not the data itself)? Do you try to interrupt while the features are being calculated (the progress bar is below 100%) or after that (during imputing)?

I am not completely sure whether we can do anything in tsfresh (we are just using standard Python multiprocessing), but at least I can try to reproduce the issue.

mendel5 commented 3 years ago

Would someone be able to share the dimensions of your data (not the data itself)?

Sure. Here is my code:

from tsfresh import extract_features
from tsfresh.feature_extraction import ComprehensiveFCParameters
from tsfresh.utilities.dataframe_functions import impute, roll_time_series

df_features  # 1346 rows × 122 columns

df_rolled = roll_time_series(df_features,
                             column_id='personID',
                             column_sort='date',
                             max_timeshift=20,
                             min_timeshift=5,
                             rolling_direction=1,
                             n_jobs=7)

df_rolled.reset_index(drop=True, inplace=True)
df_rolled.drop(['personID'], axis=1, inplace=True)
df_rolled  # 28041 rows × 122 columns

extraction_settings = ComprehensiveFCParameters()

X = extract_features(df_rolled,
                     column_id='id',
                     column_sort='date',
                     default_fc_parameters=extraction_settings,
                     impute_function=impute,
                     n_jobs=7)

Do you try to interrupt while the features are calculated (the progress bar is below 100%) or after that (during imputing)?

When I have to interrupt the kernel I usually do it shortly after starting the calculation. The progress bar is below 100 percent. Here is an example where the issue occurred:

Feature Extraction:   0%|          | 0/35 [00:41<?, ?it/s]

KeyboardInterrupt: 

Process ForkPoolWorker-21:
Process ForkPoolWorker-20:
Process ForkPoolWorker-15:
Process ForkPoolWorker-17:
Traceback (most recent call last):
Process ForkPoolWorker-18:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()

Depending on how many seconds after the start I try to interrupt, the error message usually differs; the one above is just an example. I just ran another test and interrupted after 00:32 instead of the 00:41 shown above. The error message was different, but the processes still kept running in the background.

G1401010 commented 3 years ago

I also encounter the same problem, but not every time. Sometimes when it occurs, I restart Jupyter, and then I can use the stop button to interrupt the process.

MaxBenChrist commented 3 years ago

Not a very helpful comment, but I encounter the same issue all the time. I run a script and want to stop it, I use Ctrl-C on the command line or the interrupt button in the notebook, but the script does not stop immediately.

This happens on different OSes: macOS, Linux, etc. And it happens not only with notebooks, but also when I fit models in a Python script on the command line. So this issue might not be related to tsfresh at all.

I think this happens when the code is busy inside a C extension (e.g. multiplying big numpy arrays), so Python cannot react to the interrupt signal until control returns to the interpreter. I use pkill -9 python all the time because of that.
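The "-9 or nothing" part can be demonstrated with a toy experiment: a process can ignore SIGTERM, but SIGKILL (the -9 in pkill -9) cannot be caught or ignored. The sketch below is POSIX-only and uses plain subprocess, nothing tsfresh-specific; the child explicitly ignores SIGTERM to mimic a worker that never gets back to Python's signal handling because it is busy in C code.

```python
import subprocess
import sys
import time

# Spawn a child that ignores SIGTERM, standing in for a stuck worker.
child = subprocess.Popen([
    sys.executable, "-c",
    "import signal, time;"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
    "time.sleep(60)",
])
time.sleep(1)           # give the child time to install its handler

child.terminate()       # polite SIGTERM: ignored, the child keeps running
time.sleep(1)
print(child.poll())     # None -> still alive

child.kill()            # SIGKILL cannot be ignored (the `pkill -9` route)
child.wait(timeout=10)
print(child.returncode) # negative value -> killed by a signal
```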

G1401010 commented 3 years ago

I think so too.

I happen to be working on a project these days, so I paid attention to it. I found that not only tsfresh can't be stopped; the same applies to xgboost and others.

And this kind of problem usually occurs when the code has been running for a long time before I interrupt it. If I interrupt soon after starting, it can often be stopped successfully.

kasuteru commented 2 years ago

Just to chime in: we are encountering the same problem with Azure Databricks (which is essentially a fancy version of notebooks). We can't cancel, but also, the cell runs through yet never actually "finishes", so the notebook gets stuck and doesn't execute the next cell. We found that setting ncores=0 seems to fix it, but we are still not sure.
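For reference, the tsfresh parameter is spelled n_jobs; with n_jobs=0 the calculation runs serially in the calling process instead of a multiprocessing pool, which is plausibly why interrupts then land reliably: KeyboardInterrupt is raised right in the loop doing the work, with no worker children left behind. A toy sketch of the two execution paths (illustrative only, not tsfresh code):

```python
import multiprocessing as mp

def process_chunk(x):
    # Stand-in for calculating the features of one time-series chunk.
    return x * x

def run(chunks, n_jobs):
    if n_jobs == 0:
        # Serial path: everything runs in the calling process, so an
        # interrupt (Ctrl-C / "Interrupt kernel") hits this loop directly.
        return [process_chunk(c) for c in chunks]
    # Parallel path: work moves to pool workers; an interrupt in the
    # parent does not automatically stop children that are still busy.
    with mp.Pool(n_jobs) as pool:
        return pool.map(process_chunk, chunks)

if __name__ == "__main__":
    print(run(range(5), 0))  # [0, 1, 4, 9, 16]
    print(run(range(5), 2))  # same result via the pool path
```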