SteveClement / ioc_parser

Tool to extract indicators of compromise from security reports in PDF format

Handle "encrypted" PDFs #1

Open SteveClement opened 6 years ago

SteveClement commented 6 years ago

Unlock protected PDF in Python

PDF Miner Encryption Error

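Consider decrypting the PDF with qpdf before parsing:

from subprocess import call
# Pass the arguments as a list (no shell) so filenames with spaces or shell metacharacters are safe.
call(['qpdf', '--password=', '--decrypt', pdf_filename, pdf_filename_decr])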
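
A slightly fuller sketch of the qpdf approach, wrapped in a helper. The names `build_qpdf_command` and `decrypt_pdf` are hypothetical and not part of ioc_parser. The sketch assumes the `qpdf` binary is on PATH and that the PDF has an empty user password, as in the call above. It treats exit status 3 as success, on the understanding that qpdf uses that code when it only emitted warnings.

```python
import shutil
import subprocess


def build_qpdf_command(src, dst, password=""):
    """Return the qpdf argument list for decrypting src into dst."""
    return ["qpdf", "--password=" + password, "--decrypt", src, dst]


def decrypt_pdf(src, dst, password=""):
    """Write a decrypted copy of src to dst; return True on success."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf not found on PATH")
    result = subprocess.run(build_qpdf_command(src, dst, password))
    # Assumption: qpdf exits with 0 on success and 3 for warnings only.
    return result.returncode in (0, 3)
```

The parser could then be pointed at the decrypted copy instead of the original file.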

Error message

./report_to_misp.py -r isr2014.pdf
Parsing report(s) at isr2014.pdf...
Traceback (most recent call last):
  File "/home/steve/.virtualenvs/report_to_misp/local/lib/python3.6/site-packages/iocp/Parser.py", line 293, in parse
    self.parser_func(f, path)
  File "/home/steve/.virtualenvs/report_to_misp/local/lib/python3.6/site-packages/iocp/Parser.py", line 241, in parse_pdf
    self.parser_func(f, fpath)
  File "/home/steve/.virtualenvs/report_to_misp/local/lib/python3.6/site-packages/iocp/Parser.py", line 218, in parse_pdf_pdfminer
    for page in PDFPage.get_pages(f, pagenos, check_extractable=True):
  File "/home/steve/.virtualenvs/report_to_misp/local/lib/python3.6/site-packages/pdfminer/pdfpage.py", line 132, in get_pages
    raise PDFTextExtractionNotAllowed('Text extraction is not allowed: %r' % fp)
pdfminer.pdfdocument.PDFTextExtractionNotAllowed: Text extraction is not allowed: <_io.BufferedReader name='isr2014.pdf'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ioc_parser/bin/iocp", line 53, in <module>
    parser.parse(args.PATH)
  File "/home/steve/.virtualenvs/report_to_misp/local/lib/python3.6/site-packages/iocp/Parser.py", line 308, in parse
    self.handler.print_error(path, e)
  File "/home/steve/.virtualenvs/report_to_misp/local/lib/python3.6/site-packages/iocp/Output.py", line 74, in print_error
    print((json.dumps(data)))
  File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'PDFTextExtractionNotAllowed' is not JSON serializable
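
Note that the second traceback is a separate problem: `print_error` in `Output.py` hands the exception object to `json.dumps`, which cannot serialize it, so the real error gets masked. A minimal sketch of one way around that, assuming the error payload can be a plain dict of strings. `format_error` and its keys are hypothetical and not the actual `Output.py` code:

```python
import json


def format_error(path, exc):
    """Return a JSON string describing an error (hypothetical helper)."""
    data = {
        "path": path,
        # Store the exception as strings; the object itself is not serializable.
        "error": type(exc).__name__,
        "message": str(exc),
    }
    return json.dumps(data)
```

With something like this in place, the `PDFTextExtractionNotAllowed` message would be reported as JSON instead of triggering a second exception.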