uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Can't Import custom class [NameError] when using Process / Pool #188

Closed amithadiraju1694 closed 4 years ago

amithadiraju1694 commented 4 years ago

I'm trying to run a function in multiple processes using the Process class from multiprocess. Inside my function, I use a custom class that I created as part of my project and imported at the top of my notebook. The child processes can't find this class when they run the function. I've checked several GitHub issues related to this, but none of the suggested solutions worked for me. I have 6 cores on my machine and would like to run five instances of my function in parallel, each on its own core.

My Environment:

OS: Windows 10

multiprocess version: 0.70.9

pathos version: 0.2.5

dill version: 0.3.1.1

Python version: 3.7.1

My Code:

import numpy as np
import multiprocess as mp

from sklearn.model_selection import KFold

# Importing decision tree specific classes
import import_ipynb
from Decision_Trees import DTRegressor, DTClassifier
import pathos

DT_objects = mp.Queue()

def do(X_train, y_train, DT_objects):

    global splitter,number_features_split, min_samples_split, max_depth_tree

    regressor = DTRegressor(splitter = splitter, max_depth = max_depth_tree,
               min_samples_split= min_samples_split, max_features_split= number_features_split)

    regressor.fit(X_train, y_train)

    DT_objects.put(regressor)

# Setup a list of processes that we want to run
processes = [mp.Process(target=do, args=(splits[x][0], splits[x][1], DT_objects)) for x in range(5)]

# Run processes
for p in processes:
    p.start()

# Exit the completed processes
for p in processes:
    p.join()

I'm working in a Jupyter notebook. "import_ipynb" is a PyPI package for importing other Jupyter notebooks into the workspace, and "DTRegressor" is the custom class that my function uses. The variable 'splits' is just a dictionary whose values are tuples of arrays.

Stack Trace:

Process Process-1:
Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 297, in _bootstrap
    self.run()
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-7-9a0796fc3535>", line 6, in do
NameError: name 'DTRegressor' is not defined
Process Process-2:
Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 297, in _bootstrap
    self.run()
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-7-9a0796fc3535>", line 6, in do
NameError: name 'DTRegressor' is not defined
Process Process-3:
Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 297, in _bootstrap
    self.run()
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-7-9a0796fc3535>", line 6, in do
NameError: name 'DTRegressor' is not defined
Process Process-4:
Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 297, in _bootstrap
    self.run()
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-7-9a0796fc3535>", line 6, in do
NameError: name 'DTRegressor' is not defined
Process Process-5:
Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 297, in _bootstrap
    self.run()
  File "D:\Anaconda3\lib\site-packages\multiprocess\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-7-9a0796fc3535>", line 6, in do
NameError: name 'DTRegressor' is not defined

I also tried the Pool class, following some suggestions from @mmckerns, but they didn't work for me. Could someone please let me know how to fix this issue, or whether it is fixable at all? TIA! :)
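For readers hitting the same NameError: one workaround often used for worker functions is to import the needed name inside the function body, so that it is resolved in the child process rather than taken from the parent's globals. This is not confirmed anywhere in this thread as a fix for this report. The sketch below is only an illustration under assumptions: it uses the standard-library multiprocessing module and statistics.mean as stand-ins for multiprocess and DTRegressor, and fit_one and run_all are hypothetical names.

```python
import multiprocessing as mp  # stand-in; the report above uses `multiprocess`


def fit_one(values):
    """Hypothetical worker: imports what it needs inside the function body."""
    # Stand-in for `from Decision_Trees import DTRegressor`. Importing here
    # means the child process looks the name up itself when the function runs.
    from statistics import mean

    return mean(values)


def run_all(chunks, workers=2):
    """Hypothetical helper: map the worker over the chunks with a small pool."""
    with mp.Pool(processes=workers) as pool:
        return pool.map(fit_one, chunks)
```

On Windows, a call such as run_all([[1, 2, 3], [4, 5, 6]]) belongs under an `if __name__ == "__main__":` guard when run as a script. In a notebook that relies on import_ipynb, the worker would presumably also need to import import_ipynb before importing from the other notebook; that part is an assumption and untested here.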

mmckerns commented 4 years ago

The suggestion I made to you earlier was to use the master version of dill on GitHub, as there was a patch for this particular issue. That patch is now in a new release of dill, so you should be able to just update your dill with pip (or, equivalently, get the latest pathos release). Reopen if this doesn't solve it for you. This is a duplicate of #129.
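A quick way to check whether an environment has moved past the dill version given in the report (0.3.1.1) is to compare version strings. The sketch below is a minimal illustration, not part of dill or pathos: the helper names are hypothetical, and the thread does not say which release carries the patch, so "newer than the reported version" is only an assumed proxy.

```python
REPORTED_DILL = "0.3.1.1"  # dill version listed in the original report


def parse_version(text):
    """Turn '0.3.1.1' into (0, 3, 1, 1), keeping only the digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)


def is_newer_than_reported(installed, reported=REPORTED_DILL):
    """True if `installed` compares greater than the version from the report."""
    return parse_version(installed) > parse_version(reported)


def installed_dill_version():
    """Return the installed dill version string, or None if dill is missing."""
    try:
        import dill
    except ImportError:
        return None
    return dill.__version__
```

If installed_dill_version() returns a string for which is_newer_than_reported(...) is False, updating dill with pip as described above is the first thing to try.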