intive-DataScience / tbats

BATS and TBATS forecasting methods
MIT License

OS Error: Too Many Files Open #2

ajosanchez closed this issue 5 years ago

ajosanchez commented 5 years ago

I'm running the code on a Linux machine, and while the multiprocessing pool is being created it appears that files are being opened but not closed in time. As a result I get an OSError saying that too many files are open:

OSError                                   Traceback (most recent call last)
<ipython-input-14-70618b9681ef> in <module>()
      4     estimator = TBATS(seasonal_periods=[7, 30.5])
      5     train = X.set_index('ds').y.resample('D').sum().loc['2016':].iloc[:-i]
----> 6     fitted_model = estimator.fit(train)
      7     y_forecasted = fitted_model.forecast(steps=1)
      8 

/anaconda/envs/py35/lib/python3.5/site-packages/tbats/abstract/Estimator.py in fit(self, y)
     95             return self.context.create_constant_model(y[0]).fit(y)
     96 
---> 97         best_model = self._do_fit(y)
     98 
     99         for warning in best_model.warnings:

/anaconda/envs/py35/lib/python3.5/site-packages/tbats/tbats/TBATS.py in _do_fit(self, y)
     69         """Checks various model combinations to find best one by AIC"""
     70         components_grid = self._prepare_non_seasonal_components_grid()
---> 71         non_seasonal_model = self._choose_model_from_possible_component_settings(y, components_grid=components_grid)
     72 
     73         harmonics_choosing_strategy = self.context.create_harmonics_choosing_strategy(n_jobs=self.n_jobs)

/anaconda/envs/py35/lib/python3.5/site-packages/tbats/abstract/Estimator.py in _choose_model_from_possible_component_settings(self, y, components_grid)
    140         self._y = y
    141         # note n_jobs = None means to use cpu_count()
--> 142         pool = multiprocessing.pool.Pool(processes=self.n_jobs)
    143         models = pool.map(self._case_fit, components_grid)
    144         self._y = None  # clean-up

/anaconda/envs/py35/lib/python3.5/multiprocessing/pool.py in __init__(self, processes, initializer, initargs, maxtasksperchild, context)
    172         self._processes = processes
    173         self._pool = []
--> 174         self._repopulate_pool()
    175 
    176         self._worker_handler = threading.Thread(

/anaconda/envs/py35/lib/python3.5/multiprocessing/pool.py in _repopulate_pool(self)
    237             w.name = w.name.replace('Process', 'PoolWorker')
    238             w.daemon = True
--> 239             w.start()
    240             util.debug('added worker')
    241 

/anaconda/envs/py35/lib/python3.5/multiprocessing/process.py in start(self)
    103                'daemonic processes are not allowed to have children'
    104         _cleanup()
--> 105         self._popen = self._Popen(self)
    106         self._sentinel = self._popen.sentinel
    107         # Avoid a refcycle if the target function holds an indirect

/anaconda/envs/py35/lib/python3.5/multiprocessing/context.py in _Popen(process_obj)
    265         def _Popen(process_obj):
    266             from .popen_fork import Popen
--> 267             return Popen(process_obj)
    268 
    269     class SpawnProcess(process.BaseProcess):

/anaconda/envs/py35/lib/python3.5/multiprocessing/popen_fork.py in __init__(self, process_obj)
     18         sys.stderr.flush()
     19         self.returncode = None
---> 20         self._launch(process_obj)
     21 
     22     def duplicate_for_child(self, fd):

/anaconda/envs/py35/lib/python3.5/multiprocessing/popen_fork.py in _launch(self, process_obj)
     64     def _launch(self, process_obj):
     65         code = 1
---> 66         parent_r, child_w = os.pipe()
     67         self.pid = os.fork()
     68         if self.pid == 0:

OSError: [Errno 24] Too many open files
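
Errno 24 means the process has hit its limit on open file descriptors; in the traceback above the failing call is `os.pipe()`, made while a pool worker is being started. A minimal sketch for checking how close a process is to that limit, assuming a Linux system where `/proc/self/fd` is available (the helper names `open_fd_count` and `fd_limits` are hypothetical and not part of tbats):

```python
import os
import resource


def open_fd_count():
    """Return how many file descriptors this process has open.

    Relies on the Linux-specific /proc/self/fd directory and returns
    None where that directory does not exist.
    """
    fd_dir = "/proc/self/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("open:", open_fd_count(), "soft limit:", soft, "hard limit:", hard)
```

Calling `open_fd_count()` before and after each `estimator.fit(train)` in the loop should show whether the count keeps growing from one fit to the next.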

cotterpl commented 5 years ago

Thank you for spotting this. I have merged your pull request and I am releasing a new version with your fix.
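
The merged fix itself is not shown in this thread. As a hedged illustration only, one common way to avoid this kind of leak is to make sure each pool is shut down as soon as its results have been collected, for example by using it as a context manager. The sketch below assumes the leak comes from pools that are created on every `fit` call and never closed; `_fit_one` and `fit_candidates` are hypothetical stand-ins, not tbats functions, and the `"fork"` start method is chosen to match the `popen_fork` frames in the traceback (it is only available on Unix-like systems).

```python
import multiprocessing


def _fit_one(x):
    # Hypothetical stand-in for fitting one candidate model.
    return x * x


def fit_candidates(grid, n_jobs=None):
    """Map _fit_one over grid in a worker pool that is always shut down."""
    # "fork" is an assumption that mirrors the traceback; other start
    # methods work too when the script has a __main__ guard.
    ctx = multiprocessing.get_context("fork")
    # Leaving the with-block calls pool.terminate(). map() has already
    # returned every result by then, so no work is lost, and the workers
    # do not linger until garbage collection.
    with ctx.Pool(processes=n_jobs) as pool:
        return pool.map(_fit_one, list(grid))


if __name__ == "__main__":
    # Repeated calls, as in the loop from the report, each create and
    # then shut down their own pool.
    for _ in range(20):
        fit_candidates(range(8), n_jobs=2)
```

An explicit `pool.close()` followed by `pool.join()` in a `try`/`finally` block is an equivalent alternative when the pool cannot be used in a `with` statement.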