scikit-learn-contrib / skope-rules

machine learning with logical rules in Python
http://skope-rules.readthedocs.io
Other

remove n_jobs=1 default #20

Open AlJohri opened 5 years ago

AlJohri commented 5 years ago

Even when n_jobs is not passed, skope-rules still goes through joblib, as the logs show. This is because n_jobs defaults to 1 in the SkopeRules class:

https://github.com/scikit-learn-contrib/skope-rules/blob/e7f7b932587545b8a947256e3b3c087eea0e1a94/skrules/skope_rules.py#L152

https://github.com/scikit-learn-contrib/skope-rules/blob/e7f7b932587545b8a947256e3b3c087eea0e1a94/skrules/skope_rules.py#L169

https://github.com/scikit-learn-contrib/skope-rules/blob/e7f7b932587545b8a947256e3b3c087eea0e1a94/skrules/skope_rules.py#L280

n_jobs should default to None and, when it is None, should not be passed to BaggingClassifier and BaggingRegressor at all, so that joblib is not triggered. Something like:

extra_kwargs = {}
if self.n_jobs is not None:
    extra_kwargs = {'n_jobs': self.n_jobs}
bagging_clf = BaggingClassifier(..., ..., **extra_kwargs)

If there's an easier way to do ^ please let me know.
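
One possibly tidier way to write the same idea is a small helper that drops every keyword whose value is None before forwarding. The sketch below is only an illustration: `optional_kwargs` and `FakeBagging` are hypothetical names, and `FakeBagging` stands in for BaggingClassifier so the example runs without scikit-learn.

```python
def optional_kwargs(**candidates):
    """Return only the keyword arguments whose value is not None."""
    return {name: value for name, value in candidates.items() if value is not None}


class FakeBagging:
    """Hypothetical stand-in for BaggingClassifier; it records the kwargs it receives."""

    def __init__(self, **kwargs):
        self.received = kwargs


# n_jobs=None: the keyword is omitted, so the estimator's own default applies.
clf_default = FakeBagging(n_estimators=10, **optional_kwargs(n_jobs=None))

# n_jobs=4: the keyword is forwarded unchanged.
clf_parallel = FakeBagging(n_estimators=10, **optional_kwargs(n_jobs=4))

print(clf_default.received)
print(clf_parallel.received)
```

The helper tests `is not None` instead of truthiness, so the decision to forward depends only on whether the caller set a value.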

This will prevent joblib from triggering at all when n_jobs is None. It is much easier to debug parallel processing issues like #18 when I can enable or disable joblib entirely.

Happy to submit a PR for this!

ngoix commented 5 years ago

I think we should just default n_jobs to None, though I'm not sure it will change anything. From the joblib doc:

            None is a marker for 'unset' that will be interpreted as n_jobs=1
            (sequential execution) unless the call is performed under a
            parallel_backend context manager that sets another value for
            n_jobs.
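
To make the quoted rule concrete, here is a simplified pure-Python model of it. This is not joblib's code: `fake_parallel_backend` and `resolve_n_jobs` are hypothetical names, and the model assumes, beyond what the quote states, that an explicit value always takes precedence over the context.

```python
from contextlib import contextmanager

# Hypothetical stand-in for the n_jobs value set by an enclosing backend context.
_context_n_jobs = None


@contextmanager
def fake_parallel_backend(n_jobs):
    """Temporarily set a context-wide n_jobs, restoring the old value on exit."""
    global _context_n_jobs
    previous = _context_n_jobs
    _context_n_jobs = n_jobs
    try:
        yield
    finally:
        _context_n_jobs = previous


def resolve_n_jobs(n_jobs=None):
    """Apply the quoted rule: None means 'unset' and falls back to the context, else 1."""
    if n_jobs is not None:
        return n_jobs  # an explicit value is used as given (assumption of this model)
    if _context_n_jobs is not None:
        return _context_n_jobs  # 'unset' picks up the surrounding context
    return 1  # no context: sequential execution


outside = resolve_n_jobs(None)
with fake_parallel_backend(n_jobs=4):
    unset_inside = resolve_n_jobs(None)
    explicit_inside = resolve_n_jobs(1)
```

In this model a hard-coded n_jobs=1 stays at 1 even inside the context, while None follows whatever the context sets. That is the practical difference a None default would make.
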
ngoix commented 5 years ago

you're welcome to open a PR for defaulting to None though

anilkumarpanda commented 4 years ago

Opened a PR for this issue: #40