ocrmypdf / OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
http://ocrmypdf.readthedocs.io/
Mozilla Public License 2.0

[Request]: Please make rich logging library an optional dependency #1342

Open lucasgadams opened 3 months ago

lucasgadams commented 3 months ago

What were you trying to do?

The inclusion of the rich library can disrupt an application's normal logging. Most libraries that use rich make it an optional dependency and use it automatically only if it is installed (see httpx, structlog). In a production setting, most users do not want any "rich" output. While ocrmypdf lets you disable the progress bar, you can't uninstall rich, because it is a primary dependency and is imported by the package. Ideally it would become an optional dependency so that I can leave it out of our production installs. Thanks!
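
For illustration, the "use it only if it is installed" pattern could look roughly like the sketch below. This is not ocrmypdf's code: `make_log_handler`, `prefer_rich`, and the `example_app` logger name are hypothetical, and the only rich API assumed is `rich.logging.RichHandler`.

```python
import logging

try:
    # RichHandler ships with the rich library; it is imported only if rich is installed.
    from rich.logging import RichHandler
except ImportError:
    RichHandler = None  # rich is absent, so fall back to the standard library


def make_log_handler(prefer_rich: bool = True) -> logging.Handler:
    """Hypothetical helper: return a rich handler when available and wanted,
    otherwise a plain StreamHandler."""
    if prefer_rich and RichHandler is not None:
        return RichHandler()
    return logging.StreamHandler()


# A production install could opt out explicitly, or simply not install rich.
logger = logging.getLogger("example_app")  # hypothetical logger name
logger.addHandler(make_log_handler(prefer_rich=False))
logger.setLevel(logging.INFO)
logger.info("plain output, no rich formatting")
```

With this shape, leaving rich out of an install changes only the formatting of the output, not whether the package imports.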

Where are you installing/running from?

PyPI (pip, poetry, pipx, etc.)

OCRmyPDF version

16.4.0

What operating system are you working on?

Linux

Operating system details and version

No response

Simple sanity checks

Relevant log output

No response

jbarlow83 commented 3 months ago

ocrmypdf is first and foremost a command-line application. Many users don't know that it's implemented in Python and don't necessarily care. They install it using $somepackagemanager install ocrmypdf.

Your two examples, httpx and structlog, are both Python libraries consumed mainly via import, with optional CLI components that I wasn't aware of until now. CLI use is a minority case for them; for ocrmypdf it's the vast majority. I checked httpie, another CLI-first program, and found that it also brings in rich by default.

I would accept a PR that would make ocrmypdf work if rich happens to not be installed, but I'm not quite prepared to support removing it as a default when installed with pip install ocrmypdf. Requiring something like pip install ocrmypdf[cli] would inconvenience the vast majority of users. The default configuration in the PR needs to remain the best UX for the majority of users.
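
As a rough idea of what such a PR might contain, here is a minimal sketch of a progress wrapper that keeps rich as the default but still works when it is missing. The names `track_items` and `use_rich` are hypothetical and do not come from ocrmypdf; the only rich API assumed is `rich.progress.track`.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

try:
    # rich.progress.track wraps an iterable and renders a progress bar.
    from rich.progress import track as _rich_track
except ImportError:
    _rich_track = None  # rich is not installed; progress display is skipped


def track_items(
    items: Iterable[T], description: str = "Working", use_rich: bool = True
) -> Iterator[T]:
    """Hypothetical wrapper: show a rich progress bar when possible,
    otherwise yield the items unchanged."""
    if use_rich and _rich_track is not None:
        yield from _rich_track(items, description=description)
    else:
        yield from items


# Usage: the loop body is identical whether or not rich is present.
for page in track_items(range(3), description="OCR", use_rich=False):
    pass  # process each page here
```

Because the fallback path yields the same items in the same order, callers would not need to know which branch ran, and `pip install ocrmypdf` could keep pulling in rich by default.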