Add utility to get PDF info for proper titles on PDF entries

openzim / python-scraperlib

Collection of Python code to re-use across Python-based scrapers

GNU General Public License v3.0


Open benoit74 opened 2 weeks ago

benoit74 commented 2 weeks ago

The content of PDF documents is not indexed for suggestions, even though in some ZIMs the PDFs are the "core" of the ZIM.

Having a utility in scraperlib to extract PDF info and get the document title would probably help, since PDF entries could then be given proper titles.
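To make the idea concrete, here is a rough sketch of what such a helper might look like. It is not an existing scraperlib API: the function names `choose_title` and `get_pdf_title` are hypothetical, and the use of `pypdf` is an assumption (any PDF library exposing the document information would do). The fallback to the file stem when no title is set is also just one possible choice.

```python
from __future__ import annotations

from pathlib import Path


def choose_title(metadata_title: str | None, fallback: str) -> str:
    """Return the cleaned metadata title, or the fallback if it is missing or blank."""
    if metadata_title is None:
        return fallback
    # Collapse runs of whitespace (including newlines) into single spaces.
    cleaned = " ".join(metadata_title.split())
    return cleaned or fallback


def get_pdf_title(pdf_path: Path) -> str:
    """Hypothetical helper: read the PDF Title field, falling back to the file stem."""
    # Assumption: pypdf is installed. It is imported lazily so that
    # choose_title stays usable without it.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    metadata = reader.metadata  # None when the PDF has no document information
    raw_title = metadata.title if metadata is not None else None
    return choose_title(raw_title, fallback=pdf_path.stem)
```

A scraper could then call something like `get_pdf_title(Path("report.pdf"))` and use the result as the entry title when adding the PDF to the ZIM, so that title-based suggestions have something meaningful to match on.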