izderadicka / pdfparser

Python binding to libpoppler with focus on text extraction

Convert to setuptools, use pkg-config to find poppler, add meta #3

Closed oplahcinski closed 7 years ago

oplahcinski commented 7 years ago

Added some metadata updates to try to make the project easier to find when looking for ways to speed up pdfminer.

izderadicka commented 7 years ago

Thanks a lot for the PR. My use case is to use a local libpoppler (built from source and copied to the directory with the module). Can you make the use of pkg-config optional in the setup script? If the environment variable POPPLER_ROOT is set, use it; otherwise use the information from pkg-config.
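The fallback requested here could look roughly like the sketch below. It is only an illustration, not the code from this PR: the function name `poppler_build_config`, the injectable `pkg_config` callable, and the assumption that headers and libraries sit directly under POPPLER_ROOT are all hypothetical. The only external command used is `pkg-config --cflags --libs poppler`.

```python
import os
import shlex
import subprocess


def poppler_build_config(environ=None, pkg_config=None):
    """Return include dirs, library dirs and libraries for building against poppler.

    Hypothetical helper: prefers POPPLER_ROOT, falls back to pkg-config.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("POPPLER_ROOT")
    if root:
        # Assumption: a local source build with headers and libs under root.
        return {
            "include_dirs": [root],
            "library_dirs": [root],
            "libraries": ["poppler"],
        }

    if pkg_config is None:
        def pkg_config(package):
            # Ask pkg-config for compiler and linker flags of the package.
            out = subprocess.check_output(
                ["pkg-config", "--cflags", "--libs", package]
            )
            return out.decode()

    config = {"include_dirs": [], "library_dirs": [], "libraries": []}
    prefixes = {"-I": "include_dirs", "-L": "library_dirs", "-l": "libraries"}
    for flag in shlex.split(pkg_config("poppler")):
        key = prefixes.get(flag[:2])
        if key and len(flag) > 2:
            # Strip the two-character prefix (-I, -L or -l) and keep the value.
            config[key].append(flag[2:])
    return config
```

The returned dictionary has the same keys that a setuptools `Extension` accepts as keyword arguments, so it could be passed along with `**config`. Flags other than `-I`, `-L` and `-l` are ignored in this sketch.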

For a system-wide installation of libpoppler, the cpp headers are also needed for pdfparser to build; make install does not install them (in Debian I think they are in a separate package, libpoppler-cpp-dev). How did it work in your case? It would probably be worth mentioning in the README.

For the metadata: Python 2.7 is supported, and 3.5 will work too.

oplahcinski commented 7 years ago

NP, you saved me so much work!

I was going to write this manually myself before I stumbled upon your repo.

I'm using CentOS 7 for my tests. All I needed to do was yum install poppler-devel, and that was enough to pip install the repo. I threw together my instructions in the README if you want to review them.

Let me know if the setup.py changes are what you were expecting. I added the packaged data to the setup call so the libpoppler.so comes along for a ride to the install directory.
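As a rough sketch of what shipping the shared library as package data can look like with setuptools: the package name `pdfparser`, the glob pattern, and the helper `setup_kwargs` are assumptions for illustration, not the actual contents of this PR's setup.py.

```python
def setup_kwargs():
    """Keyword arguments for setuptools.setup() (hypothetical values)."""
    return {
        "name": "pdfparser",
        "packages": ["pdfparser"],
        # Assumption: libpoppler.so was copied into the package directory,
        # so it is listed as package data and installed next to the module.
        "package_data": {"pdfparser": ["libpoppler.so*"]},
    }


# In setup.py this would be used as:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

`package_data` maps a package name to glob patterns relative to that package's directory, so the pattern only picks up the library if it actually lives there at build time.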

izderadicka commented 7 years ago

Thanks, I will look at it soon and try to merge it.