metachris / pdfx

Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
http://www.metachris.com/pdfx
Apache License 2.0

Unable to install pdfx #16

Closed: fl0x2208 closed this issue 3 years ago

fl0x2208 commented 8 years ago

I am trying to follow the instructions you have provided but it is not installing pdfx.

Both easy_install and pip give an error regarding requirements. Could you please add a requirements.txt to the repository for pip install, if one is required?

Running easy_install -U pdfx or setup.py gives "Couldn't find a setup script".

metachris commented 8 years ago

Can you please post the output of the command easy_install -U pdfx

fl0x2208 commented 8 years ago

Thanks for the response, Chris. The error is shown below:

root@ubuntu# easy_install -U pdfx
Processing pdfx
error: Couldn't find a setup script in /pathtodpfxfolder/pdfx

For some reason, downloading the zip file and then running pip didn't work. easy_install didn't work either.

I thought I was doing something wrong. These are the steps I had to take to fix it:

  1. git clone https://github.com/metachris/pdfx.git
  2. cd pdfx
  3. pip install pdfx

and it worked like a charm. Very good work.
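For anyone repeating the steps above from a script, here is a minimal Python sketch, not an official install method. One caveat: pip treats a bare name such as `pdfx` as a package to fetch from the index, even when a directory of that name exists, so step 3 most likely installed the released package rather than the clone. To install from the checkout itself, pass pip a path. The helper names `pip_install_command` and `install_from_checkout` are hypothetical, and the sketch assumes Python 3 with pip available.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(checkout_dir):
    """Build a pip command that installs from a local source checkout.

    The path is made absolute so that pip sees a path, not a package name.
    """
    path = Path(checkout_dir).resolve()
    return [sys.executable, "-m", "pip", "install", str(path)]


def install_from_checkout(checkout_dir):
    """Run pip against the checkout, failing early if it has no build script."""
    path = Path(checkout_dir)
    has_build_script = any(
        (path / name).is_file() for name in ("setup.py", "pyproject.toml")
    )
    if not has_build_script:
        # Comparable to easy_install's "Couldn't find a setup script" error.
        raise FileNotFoundError(f"no setup.py or pyproject.toml in {path}")
    subprocess.check_call(pip_install_command(path))
```

Calling `install_from_checkout("pdfx")` from the directory that contains the clone would then install the checked-out source.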

bcheeves commented 5 years ago

I'm having problems installing pdfx on CentOS as well. It tries to install a Mac OSX package for the dependency of pdfminer2:

[centos@CentOS-62-x-0 build]$ sudo easy_install -U pdfx
Searching for pdfx
Reading https://pypi.python.org/simple/pdfx/
Best match: pdfx 1.3.0
Downloading https://files.pythonhosted.org/packages/a5/17/607291a65fae00859ea87e23687fc2f190bc67817ef2ec14ff39e6bd1e05/pdfx-1.3.0.tar.gz#sha256=e3b296491879e4cf074fc42b50e9f86f6f8e1ab2628969520837ad348668d8b3
Processing pdfx-1.3.0.tar.gz
Running pdfx-1.3.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-4dKJaJ/pdfx-1.3.0/egg-dist-tmp-e3VRik
zip_safe flag not set; analyzing archive contents...
Adding pdfx 1.3.0 to easy-install.pth file
Installing pdfx script to /usr/bin

Installed /usr/lib/python2.6/site-packages/pdfx-1.3.0-py2.6.egg
Processing dependencies for pdfx
Searching for chardet
Reading https://pypi.python.org/simple/chardet/
Best match: chardet 3.0.4
Downloading https://files.pythonhosted.org/packages/fc/bb/a5768c230f9ddb03acc9ef3f0d4a3cf93462473795d18e9535498c8f929d/chardet-3.0.4.tar.gz#sha256=84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae
Processing chardet-3.0.4.tar.gz
Running chardet-3.0.4/setup.py -q bdist_egg --dist-dir /tmp/easy_install-4lryHL/chardet-3.0.4/egg-dist-tmp-rEvRE1
warning: no files found matching 'requirements.txt'
zip_safe flag not set; analyzing archive contents...
Adding chardet 3.0.4 to easy-install.pth file
Installing chardetect script to /usr/bin

Installed /usr/lib/python2.6/site-packages/chardet-3.0.4-py2.6.egg
Searching for pdfminer2
Reading https://pypi.python.org/simple/pdfminer2/
Best match: pdfminer2 20151206.macosx-10.10-x86-64
Downloading https://files.pythonhosted.org/packages/e0/55/5e235321d7494772264b577a8569c102b9d9ef867f7239d14d562e89bed9/pdfminer2-20151206.macosx-10.10-x86_64.tar.gz#sha256=9c0599bfde105a8d58e3f679c31ab84871dc9bd3debf1be6511e1abaa4db867f
Processing pdfminer2-20151206.macosx-10.10-x86_64.tar.gz
error: Couldn't find a setup script in /tmp/easy_install-bdGe2w/pdfminer2-20151206.macosx-10.10-x86_64.tar.gz
[centos@CentOS-62-x-0 build]$ yum search pdfx
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile

[centos@CentOS-62-x-0 build]$ cat /etc/centos-release
CentOS release 6.10 (Final)
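One plausible reading of the log above: the "Best match" line reports the version as `20151206.macosx-10.10-x86-64`, which suggests the platform tag in the file name was taken as part of the version, so a prebuilt macOS archive was handled like a source tarball and therefore had no setup script. The Python sketch below only illustrates that kind of naive name/version split. It is not setuptools' actual parsing code, and `split_sdist_filename` is a hypothetical helper.

```python
def split_sdist_filename(filename):
    """Naively split 'name-version.ext' at the first hyphen.

    Illustration only: real installers use more careful rules.
    """
    for ext in (".tar.gz", ".tar.bz2", ".tgz", ".zip"):
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"unrecognised archive extension: {filename}")
    # Everything after the first hyphen is treated as the version.
    name, _, version = stem.partition("-")
    return name, version


if __name__ == "__main__":
    print(split_sdist_filename("pdfx-1.3.0.tar.gz"))
    print(split_sdist_filename("pdfminer2-20151206.macosx-10.10-x86_64.tar.gz"))
```

With this split, the second file name yields a "version" that still carries the macOS platform tag, which resembles what the log reports.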

Pointabc commented 5 years ago

Can you help me fix this error?

pdfx --help
Traceback (most recent call last):
  File "/usr/local/bin/pdfx", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/local/lib/python2.7/dist-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 2603, in <module>
  File "/usr/local/lib/python2.7/dist-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 666, in require
  File "/usr/local/lib/python2.7/dist-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 565, in resolve
pkg_resources.DistributionNotFound: chardet
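The `DistributionNotFound: chardet` error means the `pdfx` launcher script could not find an installed `chardet` distribution. A quick way to see which distributions are missing is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`), which is newer than the Python 2.7 in the traceback, and the names checked are the ones that appear in the logs in this thread. `missing_distributions` is a hypothetical helper, not part of pdfx.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the subset of distribution names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Names taken from the install logs in this thread.
    print(missing_distributions(["pdfx", "chardet", "pdfminer2"]))
```

Anything it lists can then be installed with pip before running `pdfx --help` again.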

metachris commented 3 years ago

Can you try with v1.4.1 please? pip install -U pdfx. If the problem still exists, please reopen. Thanks