Open tiffsea opened 4 years ago
Hi, it would be useful if you could provide the full traceback, rather than only the error at the end.
@tiffsea refextract uses pdftotext in the background. The error seems to occur because refextract cannot find pdftotext on your system. Try installing it by following the instructions for OS dependencies here:
https://pypi.org/project/pdftotext/
and installing pdftotext:
pip install pdftotext
as well as:
conda install -c conda-forge poppler
The above solved the issue for me.
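A quick way to check whether the install worked is to ask Python where the executable lives. This is only a minimal sketch: the helper name find_pdftotext is made up for illustration, and it assumes refextract looks for a pdftotext executable on PATH, which is what the comment above suggests but which this snippet does not confirm.

```python
import shutil


def find_pdftotext(name="pdftotext"):
    """Return the absolute path of the executable, or None if it is not on PATH.

    Hypothetical helper for illustration; it is not part of refextract.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_pdftotext()
    if path is None:
        print("pdftotext was not found on PATH; install poppler or Xpdf first")
    else:
        print(f"pdftotext found at {path}")
```

If this reports that nothing was found after installing, the executable is probably outside the PATH of the environment Python runs in (for example, a different conda environment).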
@tiffsea To my limited knowledge, pip install pdftotext
installs a different package from the one needed here (correct me if I am wrong). refextract requires pdftotext(1) version 3.00 to be installed.
So I installed XpdfReader instead (https://www.xpdfreader.com/pdftotext-man.html) using the commands:
wget http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler73_0.62.0-2ubuntu2.12_amd64.deb
sudo apt-get install ./libpoppler73_0.62.0-2ubuntu2.12_amd64.deb
wget http://archive.ubuntu.com/ubuntu/pool/universe/x/xpdf/xpdf_3.04-7_amd64.deb
sudo apt-get install ./xpdf_3.04-7_amd64.deb
(ref: https://askubuntu.com/questions/1245518/how-to-install-xpdf-on-ubuntu-20-04)
The above solved the issue for me.
I get the following error when trying out the example code from the refextract docs. I will explain my system below.
Error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
Installation Used
I used
pip install refextract
via the terminal on macOS version 10.11.6 (15G22010). The installation succeeded, although I did have to manually install libmagic using
brew install libmagic
as I was getting an error initially.
Usage Used
I tried first,
and got the following error:
Then, similar to the example code from the docs, I changed the code to,
which gives the same error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
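For anyone hitting the same message: a TypeError of this form is what Python raises when None is passed to os.stat. The sketch below reproduces that with a lookup that fails. It assumes the None comes from a failed search for an external tool; it does not show refextract's actual code, and describe_tool is a made-up name.

```python
import os
import shutil


def describe_tool(name):
    """Look up an executable on PATH and stat it, reporting what happens.

    Hypothetical helper for illustration only.
    """
    path = shutil.which(name)  # None when the executable is not on PATH
    try:
        os.stat(path)  # os.stat(None) raises TypeError
    except TypeError as exc:
        return f"{name}: {exc}"
    return f"{name}: found at {path}"


if __name__ == "__main__":
    # A name that should not exist, to trigger the failure path.
    print(describe_tool("no-such-executable-xyz-12345"))
    print(describe_tool("pdftotext"))
```

If the second line also reports a NoneType error, the executable is not visible to Python, which would be consistent with the error in this report.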