antoinecarme / pyaf

PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.
BSD 3-Clause "New" or "Revised" License

Crash when training using example code - An attempt has been made to start a new process before the current process has finished its bootstrapping phase #156

Closed sachingooo closed 2 years ago

sachingooo commented 3 years ago

I have pyaf installed and am attempting to run the example code from https://pypi.org/project/pyaf/

Here's the code:

import numpy as np
import pandas as pd
import pyaf.ForecastEngine as autof

N = 360
df_train = pd.DataFrame({"Date" : pd.date_range(start="2016-01-25", periods=N, freq='D'), "Signal" : (np.arange(N)//40 + np.arange(N) % 21 + np.random.randn(N))})
lEngine = autof.cForecastEngine()
lEngine.train(iInputDS = df_train, iTime = 'Date', iSignal = 'Signal', iHorizon = 7)

It fails on the last line with the error trace below, which is printed repeatedly until I stop the process:

Traceback (most recent call last):
self.mSignalDecomposition.train(iInputDS, iTime, iSignal, iHorizon, iExogenousData);INFO:pyaf.std:START_TRAINING 'Signal'
INFO:pyaf.std:START_TRAINING 'Signal'
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 641, in train
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\ForecastEngine.py", line 25, in train

INFO:pyaf.std:START_TRAINING 'Signal'
    self.mSignalDecomposition.train(iInputDS, iTime, iSignal, iHorizon, iExogenousData);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 641, in train
    lTrainer.train(iInputDS, iTime, iSignal, iHorizon)
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed
Traceback (most recent call last):
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\ForecastEngine.py", line 25, in train
    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed
    pool = Pool(self.mOptions.mNbCores)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 119, in Pool
Traceback (most recent call last):
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\ForecastEngine.py", line 25, in train
    pool = Pool(self.mOptions.mNbCores)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 176, in __init__
    self._repopulate_pool()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    self.mSignalDecomposition.train(iInputDS, iTime, iSignal, iHorizon, iExogenousData);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 641, in train
    lTrainer.train(iInputDS, iTime, iSignal, iHorizon)
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
    self.mSignalDecomposition.train(iInputDS, iTime, iSignal, iHorizon, iExogenousData);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 641, in train
    w.start()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\process.py", line 112, in start
    context=self.get_context())
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 176, in __init__
    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed
    lTrainer.train(iInputDS, iTime, iSignal, iHorizon)
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
    self._popen = self._Popen(self)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 322, in _Popen
    self._repopulate_pool()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    w.start()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\process.py", line 112, in start
    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed
    return Popen(process_obj)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
          File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 641, in train
    lTrainer.train(iInputDS, iTime, iSignal, iHorizon)
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
Traceback (most recent call last):
prep_data = spawn.get_preparation_data(process_obj._name)    pool = Pool(self.mOptions.mNbCores)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 119, in Pool
    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed
    lTrainer.train(iInputDS, iTime, iSignal, iHorizon)
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
    context=self.get_context())
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\ForecastEngine.py", line 25, in train

    pool = Pool(self.mOptions.mNbCores)    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed

  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 176, in __init__
          File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 176, in __init__
pool = Pool(self.mOptions.mNbCores)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 119, in Pool
pool = Pool(self.mOptions.mNbCores)
self._repopulate_pool()        self._popen = self._Popen(self)  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\spawn.py", 
line 143, in get_preparation_data
    context=self.get_context())
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 176, in __init__
    self._repopulate_pool()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 241, in _repopulate_pool

  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 322, in _Popen

  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 119, in Pool
self._repopulate_pool()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    _check_not_importing_main()            self.mSignalDecomposition.train(iInputDS, iTime, iSignal, iHorizon, iExogenousData);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 641, in train

  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
context=self.get_context())
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 176, in __init__
w.start()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\process.py", line 112, in start
w.start()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\process.py", line 112, in start
    lTrainer.train(iInputDS, iTime, iSignal, iHorizon)
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 322, in train
    self._popen = self._Popen(self)
w.start()    self._popen = self._Popen(self)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\context.py", line 322, in _Popen
    self.train_multiprocessed(iInputDS, iTime, iSignal, iHorizon);
  File "pathtomyfolder\.venv\lib\site-packages\pyaf\TS\SignalDecomposition.py", line 355, in train_multiprocessed
    return Popen(process_obj)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
        is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
antoinecarme commented 3 years ago

Hi @sachingooo

Thanks for using pyaf.

This seems to be an issue with Python multiprocessing on Windows. Could you please try the fix described here:

https://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing

olivernash commented 3 years ago

Hello @antoinecarme @sachingooo - I'm having the same issue on macOS.

Roemer-de-Ruiter commented 2 years ago

@olivernash

For people running into this problem on a Mac: since Python 3.8, macOS uses spawn instead of fork as the default start method for new processes.

Try with:

import multiprocessing
multiprocessing.set_start_method("fork")
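A minimal sketch of how that call might be made portable, assuming the script should fall back to the platform default where "fork" does not exist (it is not available on Windows). The helper name `choose_start_method` is hypothetical and not part of pyaf or the standard library.

```python
import multiprocessing
import sys


def choose_start_method(preferred="fork"):
    """Return `preferred` if this platform supports it, else the platform default.

    Hypothetical helper for illustration only.
    """
    available = multiprocessing.get_all_start_methods()
    if preferred in available:
        return preferred
    # The first entry of get_all_start_methods() is the platform default.
    return available[0]


if __name__ == "__main__":
    # set_start_method() should be called at most once, before any pool
    # or process is created, so it lives under the main guard.
    multiprocessing.set_start_method(choose_start_method("fork"))
    print(sys.platform, multiprocessing.get_start_method())
```

On Windows this keeps "spawn", so the `if __name__ == '__main__'` guard is still required there.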
antoinecarme commented 2 years ago

@Roemer-de-Ruiter

Thanks a lot for the feedback. Python 3.9 still has some issues with multiprocessing on Windows and macOS. These issues are related to the spawn/fork start methods.

Were you able to try this fix on macOS (I only use Linux)? A copy-paste of the log of this script would be welcome:

https://github.com/antoinecarme/pyaf/blob/master/tests/func/test_ozone.py

Roemer-de-Ruiter commented 2 years ago

@antoinecarme

I was able to fix it on macOS by setting the multiprocessing start method to 'fork' as described above. As for the log of test_ozone.py, I get exactly the same results as the ones reported by @sachingooo in the original question.

antoinecarme commented 2 years ago

This issue is limited to Windows and macOS and affects all Python 3 multiprocessing users (an external Python bug, not specific to pyaf).

Linux is OK => No significant impact on cloud users.

antoinecarme commented 2 years ago

Filed a Python bug report with a minimal example of how to reproduce this issue.

https://github.com/python/cpython/issues/91573

antoinecarme commented 2 years ago

Following @arhadthedev's recommendation: using if __name__ == '__main__' is mandatory for Windows users, whereas test scripts work without it on Linux.

https://github.com/python/cpython/issues/91573#issuecomment-1100207553

I will adapt the demo script in the README.md file on the main page to reflect this (confirmations are welcome).
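The idiom requested by the error message can be sketched without pyaf. In the traceback above, the `train` call ends up in `train_multiprocessed`, which creates a `Pool`. The `train` and `fit_one` functions below are hypothetical stand-ins for that behaviour, not pyaf code. With the spawn start method each worker re-imports the main module, so any unguarded top-level call that creates a pool would run again inside every worker.

```python
import multiprocessing


def fit_one(candidate):
    # Hypothetical stand-in for fitting one candidate model.
    # It is defined at module level so worker processes can import it.
    return candidate * candidate


def train(n_workers=2):
    # Hypothetical stand-in for a library call that creates a Pool internally.
    with multiprocessing.Pool(n_workers) as pool:
        return pool.map(fit_one, range(5))


if __name__ == "__main__":
    # Only the process that runs this file as a script reaches this block;
    # spawned workers import the module under another name and skip it.
    multiprocessing.freeze_support()  # only matters for frozen Windows executables
    print(train())
```

Applied to the demo script from the original question, this would mean moving the `lEngine.train(...)` call (and, for simplicity, the lines that build `df_train`) under the same guard.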

antoinecarme commented 2 years ago

Demo script updated.

Fixed.