Add pdfparser2 module

I wrote a PDF parser in Go that does everything the existing pdfparser does and much more, and it's roughly 30x faster. Details on it can be found here

Usage:

pdfparser -f input.pdf output/
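
A minimal sketch of how a Python caller might wrap the command above. The function names `build_command` and `run_pdfparser` are hypothetical, and the sketch assumes the binary is on `PATH` and exits nonzero on failure, neither of which is stated in this issue.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, output_dir):
    """Return the argv list for: pdfparser -f <pdf_path> <output_dir>."""
    return ["pdfparser", "-f", str(pdf_path), str(output_dir)]


def run_pdfparser(pdf_path, output_dir, timeout=60):
    """Run the parser and return the output directory as a Path.

    Assumption: the binary is named `pdfparser`, is on PATH, and
    signals failure with a nonzero exit code.
    """
    out = Path(output_dir)
    # Create the output directory up front in case the parser does not.
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_command(pdf_path, out), check=True, timeout=timeout)
    return out
```

Passing the arguments as a list avoids shell quoting problems with odd file names, and the timeout is a guard against malformed PDFs that might hang the parser.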

The above command creates the following files in the output dir:

commands.txt - list of commands run by launch actions
contents.txt - the text content of the PDF (may include scripts, URLs, etc.)
errors.txt - list of format errors and abnormalities that we might be able to write detections for
files.txt - list of MD5 hashes and paths of referenced embedded and external files. Embedded files are extracted to the output dir using the MD5 as the file name.
javascript.js - JavaScript from all actions in the PDF
raw.pdf - a decrypted and decoded version of the PDF
urls.txt - list of URLs referenced by actions

We should create an ACE module that scans all of the above files with appropriate YARA rules. We may also want to add some of the info in those files as observables, such as embedded files, file paths, and URLs.
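
As a rough sketch of the observable-gathering step, the following collects candidate values from the output directory. It deliberately does not use any ACE API. The names `read_lines` and `collect_observables` are hypothetical, and the file formats are assumptions: one entry per line in urls.txt and commands.txt, and `<md5> <path>` separated by whitespace on each line of files.txt.

```python
from pathlib import Path


def read_lines(path):
    """Return the non-empty, stripped lines of a text file, or [] if missing."""
    p = Path(path)
    if not p.is_file():
        return []
    text = p.read_text(errors="replace")
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def collect_observables(output_dir):
    """Gather candidate observables from the parser's output directory.

    Assumed formats (not confirmed by the issue): urls.txt and commands.txt
    hold one entry per line; each files.txt line is "<md5> <path>".
    """
    out = Path(output_dir)
    observables = {
        "url": read_lines(out / "urls.txt"),
        "command": read_lines(out / "commands.txt"),
        "md5": [],
        "file_path": [],
    }
    for line in read_lines(out / "files.txt"):
        # Split on the first run of whitespace so paths may contain spaces.
        parts = line.split(None, 1)
        observables["md5"].append(parts[0])
        if len(parts) == 2:
            observables["file_path"].append(parts[1])
    return observables
```

An actual module would map each key to whatever observable types ACE defines and would run YARA over the files separately.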

IntegralDefense / ACE

Add pdfparser2 module #246