kanzure / pdfparanoia

pdf watermark removal library for academic papers
https://pypi.python.org/pypi/pdfparanoia
533 stars, 52 forks

Why do I get a bytes problem? #52

Open jilinsunkun opened 5 years ago

jilinsunkun commented 5 years ago

what do I mean by string

C:\Users\dell\Desktop\水印删除>C:\Users\dell\Desktop\水印删除\pdf-watermark-removal.py
Traceback (most recent call last):
  File "C:\Users\dell\Desktop\水印删除\pdf-watermark-removal.py", line 3, in <module>
    pdf = pdfparanoia.scrub(open("天勤2019数据结构计算机考研复习指导电子版PDF.pdf", "rb"))
  File "D:\ProgramData\Anaconda3\lib\site-packages\pdfparanoia-0.0.16-py3.6.egg\pdfparanoia\core.py", line 53, in scrub
    content = plugin.scrub(content, verbose=verbose)
  File "D:\ProgramData\Anaconda3\lib\site-packages\pdfparanoia-0.0.16-py3.6.egg\pdfparanoia\plugins\aip.py", line 25, in scrub
    pdf = parse_content(content)
  File "D:\ProgramData\Anaconda3\lib\site-packages\pdfparanoia-0.0.16-py3.6.egg\pdfparanoia\parser.py", line 44, in parse_content
    stream = StringIO(content)
TypeError: initial_value must be str or None, not bytes

carlcrott commented 5 years ago

Could you provide a link to the file you're parsing?

jilinsunkun commented 5 years ago

> Could you provide a link to the file you're parsing?

The original file is too big to upload, so I tried a different file, and I think the result does not change...

Traceback (most recent call last):
  File "C:\Users\dell\Desktop\水印删除\pdf-watermark-removal.py", line 3, in <module>
    pdf = pdfparanoia.scrub(open("document.pdf", "rb"))
  File "D:\ProgramData\Anaconda3\lib\site-packages\pdfparanoia-0.0.16-py3.6.egg\pdfparanoia\core.py", line 53, in scrub
    content = plugin.scrub(content, verbose=verbose)
  File "D:\ProgramData\Anaconda3\lib\site-packages\pdfparanoia-0.0.16-py3.6.egg\pdfparanoia\plugins\aip.py", line 25, in scrub
    pdf = parse_content(content)
  File "D:\ProgramData\Anaconda3\lib\site-packages\pdfparanoia-0.0.16-py3.6.egg\pdfparanoia\parser.py", line 44, in parse_content
    stream = StringIO(content)
TypeError: initial_value must be str or None, not bytes

document.pdf

udayhasan commented 5 years ago

I'm facing a similar issue. Can anyone help? Or has any of you found a workaround?

fanshao commented 4 years ago

I'm facing a similar issue.

slarrain commented 3 years ago

Similar issue

sylph520 commented 3 years ago

similar here.

seanbenhur commented 3 years ago

Oh, similar here
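
For anyone landing here, the traceback seems to point at a Python 3 type mismatch: the file is opened in "rb" mode, so the content is bytes, while io.StringIO on Python 3 accepts only str. The sketch below reproduces that error with the standard library alone and shows one possible direction for a workaround. The helper name make_stream is hypothetical and not part of pdfparanoia, and it is only an assumption that parse_content needs nothing more than a file-like object, so swapping it in may not be enough on its own.

```python
import io


def make_stream(content):
    """Hypothetical helper (not part of pdfparanoia): wrap content in a
    file-like object whose type matches the data it is given."""
    if isinstance(content, bytes):
        # Bytes read from a file opened in "rb" mode need BytesIO.
        return io.BytesIO(content)
    # Text (str) can still go through StringIO.
    return io.StringIO(content)


# Reproduce the TypeError from the traceback without pdfparanoia.
error_message = None
try:
    io.StringIO(b"%PDF-1.4")
except TypeError as exc:
    error_message = str(exc)

# The same bytes are accepted once they are wrapped in BytesIO.
stream = make_stream(b"%PDF-1.4")
header = stream.read(5)
```

If that assumption holds, the StringIO(content) call in parser.py would be the place to try such a change, but code further down the parser may also expect str rather than bytes.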