Open manuel-masiello opened 3 years ago
I'm not familiar with how Celery works, but joblib will do all the parallelisation under the hood; you just need to set n_jobs when initialising the estimator. Is this something you would expect to work with, say, a random forest in scikit-learn?
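To illustrate what "joblib does the parallelisation under the hood" means, here is a minimal sketch using joblib directly rather than gplearn or scikit-learn (the toy `square` function is mine, purely for illustration) — estimators with an n_jobs parameter dispatch their internal work through this same mechanism:

```python
from joblib import Parallel, delayed

def square(x):
    # stand-in for one unit of estimator work (e.g. fitting one tree)
    return x * x

# joblib fans the calls out across n_jobs workers and
# returns the results in the original order
results = Parallel(n_jobs=2)(delayed(square)(i) for i in range(5))
print(results)  # [0, 1, 4, 9, 16]
```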
Hello and happy new year :-)
Thank you for this quick response. I just ran a test with RandomForestRegressor with n_jobs=10. It seems to work without problems:
```python
from sklearn.ensemble import RandomForestRegressor

@appCelery.task(name='capture.tasks.TaskRandomForestRegressor')
def TaskRandomForestRegressor(X_train, y_train):
    # n_jobs=10 lets the forest fit its trees in parallel via joblib
    est_rf = RandomForestRegressor(n_jobs=10)
    est_rf.fit(X_train, y_train)
    # encodeObjLearn serialises the fitted estimator for the Celery result backend
    return encodeObjLearn(est_rf)
```
Output:

```
[2021-01-04 08:47:42,206: INFO/MainProcess] Received task: capture.tasks.TaskRandomForestRegressor[011a5d09-6a51-45b4-9ef0-27f5277fe932]
[2021-01-04 08:47:42,386: INFO/ForkPoolWorker-1] Task capture.tasks.TaskRandomForestRegressor[011a5d09-6a51-45b4-9ef0-27f5277fe932] succeeded in 0.17795764410402626s:
```
I found a possible answer to this error, but it requires a change to the library and I'm not sure it works:
How can I apply multiprocessing to gplearn's SymbolicTransformer?
It seems that gplearn supports multithreading by setting n_jobs=10.
Can we run it with multiple processes instead, which would be even faster? How can we do that?
Thanks!
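One way to steer this (a sketch based on joblib's general API, not anything gplearn-specific): joblib lets callers force a particular backend with `parallel_backend`, and `'loky'` is its process-based backend. Any joblib `Parallel` calls an estimator makes inside the context inherit that backend. The toy `cube` function below is mine, standing in for the estimator's internal work:

```python
from joblib import Parallel, delayed, parallel_backend

def cube(x):
    # stand-in for the estimator's internal per-item work
    return x ** 3

# 'loky' is joblib's process-based backend; Parallel() calls inside
# the context (including an estimator's own) inherit backend and n_jobs
with parallel_backend('loky', n_jobs=2):
    out = Parallel()(delayed(cube)(i) for i in range(4))
print(out)  # [0, 1, 8, 27]
```

Wrapping `est.fit(X, y)` in such a block is the usual way to override the backend an estimator would otherwise pick, though whether processes beat threads depends on how much of the work releases the GIL.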
Hello, thank you for this very good library :-) In combination with the Celery task queue and MongoDB it is pure happiness!
Describe the bug
When I use the parameter n_jobs=10, I get a warning message from joblib and the job runs in only one thread. I think it is related to using Celery, but I can't figure out how to fix the problem.
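A possible explanation (my assumption, not confirmed in this thread): Celery's default prefork pool runs tasks in daemonized processes, and daemonic processes cannot spawn children, so joblib's process-based backend falls back to sequential execution with a warning. Forcing joblib's thread-based backend inside the task avoids spawning processes entirely. The sketch below uses a toy `work` function in place of the real estimator; in the task you would run `est_rf.fit(...)` inside the same `with` block:

```python
from joblib import Parallel, delayed, parallel_backend

def work(x):
    # stand-in for one unit of estimator work
    return x + 1

# 'threading' never forks, so it also works inside a daemonized
# Celery worker; estimator fits inside this block inherit it
with parallel_backend('threading', n_jobs=10):
    res = Parallel()(delayed(work)(i) for i in range(8))
print(res)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Whether threads actually give a speedup depends on the estimator releasing the GIL during its heavy work; for NumPy-based fitting this is often, but not always, the case.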
Expected behavior
I would like to be able to parallelize the calculation on my 12-core processor.
Actual behavior
Steps to reproduce the behavior
System information
Linux-5.4.0-58-generic-x86_64-with-glibc2.29
Python 3.8.5 (default, Jul 28 2020, 12:59:40) [GCC 9.3.0]
NumPy 1.19.2
SciPy 1.5.4
Scikit-Learn 0.24.0
Joblib 1.0.0
gplearn 0.4.1