practical-data-science / ecommercetools

EcommerceTools is a Python data science toolkit for ecommerce, marketing science, and technical SEO analysis and modelling and was created by Matt Clarke.
MIT License

Scikit dependency deprecated in pip #38

Open romariosm opened 1 year ago

romariosm commented 1 year ago

Installation fails on the dependencies because it tries to install the deprecated `sklearn` package. The dependency should be updated to `scikit-learn`.

` × Building wheel for sklearn (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
  The 'sklearn' PyPI package is deprecated, use 'scikit-learn' rather than 'sklearn' for pip commands.

  Here is how to fix this error in the main use cases:
  - use 'pip install scikit-learn' rather than 'pip install sklearn'
  - replace 'sklearn' by 'scikit-learn' in your pip requirements files
    (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
  - if the 'sklearn' package is used by one of your dependencies,
    it would be great if you take some time to track which package uses
    'sklearn' instead of 'scikit-learn' and report it to their issue tracker
  - as a last resort, set the environment variable
    SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error

  More information is available at
  https://github.com/scikit-learn/sklearn-pypi-package

  If the previous advice does not cover your use case, feel free to report it at
  https://github.com/scikit-learn/sklearn-pypi-package/issues/new
  [end of output]`
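
The second suggestion in the log above (replace 'sklearn' by 'scikit-learn' in the requirements) could be sketched roughly as follows. This is only an illustration, not the maintainer's fix: the helper names `fix_requirement` and `fix_requirements` are hypothetical, and it assumes one requirement per line, as in a typical requirements.txt.

```python
import re

# Matches a requirement whose distribution name is exactly "sklearn",
# optionally followed by a version specifier, extras, a marker, or a comment.
# Names that merely start with "sklearn" (e.g. "sklearn-crfsuite") are skipped.
_SKLEARN_LINE = re.compile(r"^(\s*)sklearn(?=\s*(?:[=<>!~;\[#]|$))", re.IGNORECASE)


def fix_requirement(line: str) -> str:
    """Return the line with a bare 'sklearn' requirement renamed to 'scikit-learn'."""
    return _SKLEARN_LINE.sub(r"\1scikit-learn", line)


def fix_requirements(text: str) -> str:
    """Apply fix_requirement to every line of a requirements file's contents."""
    return "\n".join(fix_requirement(line) for line in text.splitlines())


if __name__ == "__main__":
    print(fix_requirements("pandas\nsklearn>=1.0\nsklearn-crfsuite"))
```

Any version pin that was written for the `sklearn` placeholder package will probably need reviewing by hand afterwards, since its version numbers do not correspond to scikit-learn releases.
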
cypris75 commented 4 months ago

I am using JupyterLab in Anaconda and could get rid of this error by running:

export SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True

... before starting the Jupyter notebook.
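
For anyone who cannot restart Jupyter from a shell, the same workaround can probably be applied from Python by passing the variable to a pip subprocess. This is a minimal sketch, not something tested against ecommercetools itself: `pip_env` and `pip_install` are hypothetical helper names, and only the environment variable name comes from the error message.

```python
import os
import subprocess
import sys

# Variable name taken from the scikit-learn deprecation message.
FLAG = "SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL"


def pip_env(base_env=None):
    """Return a copy of the environment with the workaround flag set."""
    env = dict(os.environ if base_env is None else base_env)
    env[FLAG] = "True"
    return env


def pip_install(package, dry_run=False):
    """Run 'python -m pip install <package>' with the flag set; return the command."""
    command = [sys.executable, "-m", "pip", "install", package]
    if not dry_run:
        subprocess.run(command, env=pip_env(), check=True)
    return command
```

Calling `pip_install("ecommercetools")` from a notebook cell would install into the environment of the running kernel, since it uses `sys.executable`. Note that the log itself calls this variable a last resort; it only silences the error and still installs the deprecated `sklearn` placeholder.
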

However, I then got this error when using the seo module:

File ~/opt/anaconda3/envs/py10/lib/python3.10/site-packages/lxml/html/clean.py:18
      8     __all__ = [
      9         "clean_html",
     10         "clean",
   (...)
     15         "word_break_html",
     16     ]
     17 except ImportError:
---> 18     raise ImportError(
     19         "lxml.html.clean module is now a separate project lxml_html_clean.\n"
     20         "Install lxml[html_clean] or lxml_html_clean directly."
     21     ) from None

ImportError: lxml.html.clean module is now a separate project lxml_html_clean.
Install lxml[html_clean] or lxml_html_clean directly.

Then I installed this package via:

!pip install lxml_html_clean

However, it seems the package no longer works as expected.
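
To narrow down a report like this, it may help to list which of the packages mentioned in this thread are actually installed and whether the HTML cleaner can be imported. The sketch below is only a diagnostic aid under some assumptions: the distribution names in `PACKAGES` are taken from the messages above (with `ecommercetools` assumed to be the PyPI name), and `installed_versions` and `cleaner_available` are hypothetical helper names.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version

# Distribution names mentioned in this thread (assumed PyPI names).
PACKAGES = ("ecommercetools", "sklearn", "scikit-learn", "lxml", "lxml_html_clean")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


def cleaner_available():
    """Return True if either the new or the legacy cleaner module imports."""
    for module_name in ("lxml_html_clean", "lxml.html.clean"):
        try:
            import_module(module_name)
            return True
        except ImportError:
            continue
    return False


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
    print("HTML cleaner importable:", cleaner_available())
```

Pasting the printed output into the issue, together with the exact call that misbehaves, would make the remaining problem easier to reproduce.
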