HazyResearch / pdftotree

:evergreen_tree: A tool for converting PDF into hOCR with text, tables, and figures being recognized and preserved.
MIT License

sklearn is deprecated #123

Open alvitawa opened 1 year ago

alvitawa commented 1 year ago

Pip has started to refuse to install sklearn sometimes. I believe the dependency should be updated to scikit-learn?

Step #0 - "Build": Collecting sklearn==0.0.post1
Step #0 - "Build":   Downloading sklearn-0.0.post1.tar.gz (3.6 kB)
Step #0 - "Build":   Preparing metadata (setup.py): started
Step #0 - "Build":   Preparing metadata (setup.py): finished with status 'error'
Step #0 - "Build":   error: subprocess-exited-with-error
Step #0 - "Build":   
Step #0 - "Build":   × python setup.py egg_info did not run successfully.
Step #0 - "Build":   │ exit code: 1
Step #0 - "Build":   ╰─> [18 lines of output]
Step #0 - "Build":       The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
Step #0 - "Build":       rather than 'sklearn' for pip commands.
Step #0 - "Build":       
Step #0 - "Build":       Here is how to fix this error in the main use cases:
Step #0 - "Build":       - use 'pip install scikit-learn' rather than 'pip install sklearn'
Step #0 - "Build":       - replace 'sklearn' by 'scikit-learn' in your pip requirements files
Step #0 - "Build":         (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
Step #0 - "Build":       - if the 'sklearn' package is used by one of your dependencies,
Step #0 - "Build":         it would be great if you take some time to track which package uses
Step #0 - "Build":         'sklearn' instead of 'scikit-learn' and report it to their issue tracker
Step #0 - "Build":       - as a last resort, set the environment variable
Step #0 - "Build":         SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
Step #0 - "Build":       
Step #0 - "Build":       More information is available at
Step #0 - "Build":       https://github.com/scikit-learn/sklearn-pypi-package
Step #0 - "Build":       
Step #0 - "Build":       If the previous advice does not cover your use case, feel free to report it at
Step #0 - "Build":       https://github.com/scikit-learn/sklearn-pypi-package/issues/new
Step #0 - "Build":       [end of output]
Step #0 - "Build":   
Step #0 - "Build":   note: This error originates from a subprocess, and is likely not a problem with pip.
Step #0 - "Build": error: metadata-generation-failed
Step #0 - "Build": 
Step #0 - "Build": × Encountered error while generating package metadata.
Step #0 - "Build": ╰─> See above for output.
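The log's second suggestion (replace 'sklearn' by 'scikit-learn' in requirements files) could be sketched roughly as below. This is only an illustration, not the fix from any of the pending pull requests: the helper names `fix_requirement` and `fix_requirements_text` are hypothetical, and it assumes one requirement per line. Any version pin is dropped on the assumption that a pin on the placeholder package (such as `0.0.post1`) does not correspond to a real scikit-learn release.

```python
import re

# Matches a line whose project name is exactly "sklearn", with an optional
# version specifier and an optional environment marker or trailing comment.
# Names such as "sklearn-crfsuite" are deliberately not matched.
_SKLEARN_REQ = re.compile(
    r"^\s*sklearn\s*(?P<spec>[=<>!~][^;#]*)?(?P<rest>[;#].*)?\s*$",
    re.IGNORECASE,
)


def fix_requirement(line: str) -> str:
    """Rewrite one requirement line (hypothetical helper, no trailing newline)."""
    match = _SKLEARN_REQ.match(line)
    if match is None:
        return line  # not the deprecated placeholder package, leave untouched
    rest = match.group("rest")
    # Drop the old version specifier; keep any marker or comment that followed.
    return "scikit-learn" if rest is None else "scikit-learn " + rest.strip()


def fix_requirements_text(text: str) -> str:
    """Apply fix_requirement to every line of a requirements file's contents."""
    fixed = "\n".join(fix_requirement(line) for line in text.splitlines())
    return fixed + "\n" if text.endswith("\n") else fixed


if __name__ == "__main__":
    print(fix_requirements_text("numpy\nsklearn==0.0.post1\nsklearn-crfsuite\n"))
```

Only the distribution name used by pip changes; code that does `import sklearn` stays the same, since that is the import name scikit-learn provides.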
HodeiG commented 1 year ago

Hello @lukehsiao,

Firstly, thank you for your work on the pdftotree project – it's been incredibly helpful. I've noticed that there are some dependency issues in the current version, and it seems that there are several pull requests addressing these concerns.

I understand that maintaining a project can be time-consuming, and I truly appreciate the effort you've put into it so far. Would it be possible to consider merging the pending changes that fix the dependency issues? This would greatly benefit the user community and ensure the continued success of pdftotree.

Your contributions are invaluable, and we're grateful for your ongoing dedication to the project. Thank you for your time and consideration.

Best regards,

lukehsiao commented 1 year ago

@HodeiG I'm sorry, but I no longer have write access to this repository. Consequently, I have no way to help maintain it even if I wanted to.

If you're using this repository, I suspect this is a good candidate for forking, as I would not expect anyone to continue to maintain it in the current organization.