mailgun / talon

Apache License 2.0
1.27k stars 285 forks source link

scikit-learn: depend on joblib directly #207

Open tommilligan opened 4 years ago

tommilligan commented 4 years ago

Closes #206

scikit-learn has deprecated accessing joblib internals for a while now, and this is a hard error in the release of scikit-learn==0.23.

This PR naively imports joblib directly, rather than via sklearn.externals.

mailgun-ci commented 4 years ago

Can one of the admins verify this patch?

cristicbz commented 4 years ago

Hi @obukhov-sergey can we help somehow to get this merged? It's blocking depending on newer scikit versions if we depend on talon in the same project.

Thanks for the awesome library btw!

sblondon commented 4 years ago

scikit-learn released version 0.23.1 so talon raises an exception when a new virtualenv is created. So the problem is now important: it's not about removing a Warning but keep the usabillity of the library.

You can reproduce with these steps:

 python3 -m venv venv
./venv/bin/pip install talon
./venv/bin/python
Python 3.6.10 (default, May 14 2020, 09:43:31) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import talon.signature
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/stephane/src/talonerreur/venv/lib/python3.6/site-packages/talon/signature/__init__.py", line 28, in <module>
    from . learning import classifier
  File "/home/stephane/src/talonerreur/venv/lib/python3.6/site-packages/talon/signature/learning/classifier.py", line 11, in <module>
    from sklearn.externals import joblib
ImportError: cannot import name 'joblib'
sblondon commented 4 years ago

A workaround can be done by adding a dependancy on scikit-learn in the project which needs talon. With a virtualenv, it's doable with:

python3 -m venv venv
./venv/bin/pip install "scikit-learn<0.23"
./venv/bin/pip install talon

Demo:

./venv/bin/python   
Python 3.7.3 (default, Dec 20 2019, 18:57:59) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import talon.signature
/tmp/talonerr/venv/lib/python3.7/site-packages/sklearn/externals/joblib/__init__.py:15: FutureWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
  warnings.warn(msg, category=FutureWarning)

It's only a warning, not an ImportError.

An equivalent to this workaround can be inserted in setup.py to fix the error. However, the patch provided in this PR is a better solution.

desertbadger commented 4 years ago

Hi @horkhe just wanted to bump this if you are able to help get it looked at.

alfredfrancis commented 4 years ago

+1

rynmccrmck commented 4 years ago

+1

baztian commented 2 years ago

@abhishekshingadiya or @entamburini do you have merge permissions for this project? Can you please merge it?