ContinuumIO / anaconda-issues

Anaconda issue tracking

Fix bug where Anaconda does not permit using joblib to run parallel calculations on Windows 10 #815

Open mcg1969 opened 8 years ago

mcg1969 commented 8 years ago

From https://github.com/conda/conda/issues/2581 submitted by @Sandy4321:


Could you please help fix a bug where Anaconda does not permit using joblib to run parallel calculations on Windows? I followed the example at

https://blog.dominodatalab.com/simple-parallelization

from joblib import Parallel, delayed
import multiprocessing

# what are your inputs, and what operation do you want to
# perform on each input. For example...
inputs = range(10)
def processInput(i):
    return i * i

num_cores = multiprocessing.cpu_count()

results = Parallel(n_jobs=num_cores)(delayed(processInput)(i) for i in inputs)

I get this error:

numCores = 4
__parents_main__
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Anaconda\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "C:\Anaconda\lib\multiprocessing\forking.py", line 509, in prepare
    '__parents_main__', file, path_name, etc
  File "c:\Sander\Python\S_May31_parallel.py", line 15, in <module>
    results = Parallel(n_jobs=num_cores)(delayed(processInput)(i) for i in inputs)  
  File "C:\Anaconda\lib\site-packages\joblib\parallel.py", line 766, in __call__
    n_jobs = self._initialize_pool()
  File "C:\Anaconda\lib\site-packages\joblib\parallel.py", line 515, in _initialize_pool
    raise ImportError('[joblib] Attempting to do parallel computing '
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
numCores = 4
__parents_main__

This link suggests a fix:

http://stackoverflow.com/questions/29545605/why-is-it-important-to-protect-the-main-loop-when-using-joblib-parallel

from joblib import Parallel, delayed
import multiprocessing

inputs = range(100)
def processInput(i):
    return i * i

num_cores = multiprocessing.cpu_count()

print("numCores = " + str(num_cores))

print(__name__)

if __name__ == '__main__':
    results = Parallel(n_jobs=num_cores)(delayed(processInput)(i) for i in inputs)

print(results)

But then Anaconda gives another error:

numCores = 4
__parents_main__
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Anaconda\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "C:\Anaconda\lib\multiprocessing\forking.py", line 509, in prepare
    '__parents_main__', file, path_name, etc
  File "c:\Sander\Python\S_May31_parallel.py", line 17, in <module>
    print(results)
NameError: name 'results' is not defined
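
The NameError above is consistent with how multiprocessing starts workers on Windows: there is no fork, so each child process re-imports the script under the name `__parents_main__` (visible in the output), skips the guarded block, and then reaches the unguarded `print(results)`, where `results` was never assigned. Below is a minimal sketch of a likely fix, assuming the only change needed is to move everything that uses `results` under the guard. `run_parallel` is a hypothetical helper name introduced for illustration; it is not part of the original script or of joblib.

```python
from joblib import Parallel, delayed
import multiprocessing


def processInput(i):
    # Same per-item operation as in the report.
    return i * i


def run_parallel(inputs, n_jobs):
    # Hypothetical helper (not in the original script): wraps the
    # Parallel call so nothing parallel runs at import time.
    return Parallel(n_jobs=n_jobs)(delayed(processInput)(i) for i in inputs)


if __name__ == '__main__':
    # Only the parent process enters this block. Child processes that
    # re-import the module see a different __name__ and skip it.
    num_cores = multiprocessing.cpu_count()
    print("numCores = " + str(num_cores))
    results = run_parallel(range(100), num_cores)
    # Printing happens inside the guard, so it can never run in a
    # process where results is undefined.
    print(results)
```

With this layout the child processes only import the function definitions; the pool is created and the results are printed once, in the parent process.
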
jakubLangr commented 7 years ago

+1 I ran into the same problem with conda 4.2.12

jakubLangr commented 7 years ago

More specifically, I have a Docker image of jupyter/datascience-notebook and get this error when trying to parallelize things.