kanzure / pdfparanoia

pdf watermark removal library for academic papers
https://pypi.python.org/pypi/pdfparanoia
533 stars 52 forks

Including a test PDF for cleaning + failed cleaning #35

Closed carlcrott closed 11 years ago

carlcrott commented 11 years ago

Just tried to clean a paper from "Analytical Chemistry" and it failed with the following error:

thrive@thrive-laptop:~/python_projects/pdfparanoia$ pdfparanoia Ramoji2012.pdf -o Ramoji2012_clean.pdf 
Traceback (most recent call last):
  File "/usr/local/bin/pdfparanoia", line 5, in <module>
    pkg_resources.run_script('pdfparanoia==0.0.13', 'pdfparanoia')
  File "/usr/lib/python2.6/dist-packages/pkg_resources.py", line 461, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.6/dist-packages/pkg_resources.py", line 1194, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.6/dist-packages/pdfparanoia-0.0.13-py2.6.egg/EGG-INFO/scripts/pdfparanoia", line 34, in <module>
    outputcontent = pdfparanoia.scrub(StringIO(Args.in_pdf.read()), verbose=verbose)
AttributeError: 'module' object has no attribute 'scrub'

Also per headline... it would be awesome to have a working test PDF.

kanzure commented 11 years ago

You are using an old version of pdfparanoia, can you try upgrading? I don't see the error on v0.0.15 at the moment.

pip install -U pdfparanoia

carlcrott commented 11 years ago

Looks like it ran through. I also tried upgrading to the most recent version with

git pull origin master

That didn't work, but

pip install -U pdfparanoia

worked just fine... awaiting verification from client :D
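
The traceback above shows the command running from the installed 0.0.13 egg under /usr/local/lib/python2.6/dist-packages, so updating a git checkout would not change the code that actually runs, while a pip upgrade replaces it. A minimal sketch for checking which copy Python picks up and whether it exposes `scrub` is below. It assumes Python 3.8+ (for `importlib.metadata`, which the Python 2.6 setup in this thread does not have) and that the distribution name and the import name are the same; `describe_install` is a hypothetical helper, not part of pdfparanoia.

```python
import importlib
from importlib import metadata  # available from Python 3.8


def describe_install(package, attribute):
    """Return (installed_version, module_file, has_attribute) for a package.

    installed_version is None when no distribution metadata is found.
    module_file and has_attribute are None when the module cannot be imported.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    try:
        module = importlib.import_module(package)
    except ImportError:
        return version, None, None
    # __file__ shows which copy on disk was actually imported.
    return version, getattr(module, "__file__", None), hasattr(module, attribute)


if __name__ == "__main__":
    # Hypothetical check for the situation in this thread.
    print(describe_install("pdfparanoia", "scrub"))
```

If the reported version is older than expected, or the file path points at a stale egg, upgrading with pip as suggested above is the likely fix.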